REGULATION OF GENE EXPRESSION IN PLANTS

This invention relates to methods of modulating
the expression of desired genes in plants, and to DNA
sequences and genetic constructs for use in these methods.
In particular, the invention relates to methods and
constructs for targeting of expression specifically to the
endosperm of the seeds of cereal plants such as wheat, and
for modulating the time of expression in the target tissue.
This is achieved by the use of promoter sequences from
enzymes of the starch biosynthetic pathway. In a preferred
embodiment of the invention, the sequences and/or promoters
are those of starch branching enzyme I, starch branching
enzyme II, soluble starch synthase I, and starch debranching
enzyme, all derived from Triticum tauschii, the D genome
donor of hexaploid bread wheat.

A further preferred embodiment relates to a method 
of identifying variations in the characteristics of plants. 



BACKGROUND OF THE INVENTION

Starch is an important constituent of cereal
grains and of flours, accounting for about 65-67% of the
weight of the grain at maturity. It is produced in the
amyloplast of the grain endosperm by the concerted action of
a number of enzymes, including ADP-glucose pyrophosphorylase
(EC 2.7.7.27), starch synthases (EC 2.4.1.21), branching
enzymes (EC 2.4.1.18) and debranching enzymes (EC 3.2.1.41
and EC 3.2.1.68) (Ball et al, 1996; Martin and Smith, 1995;
Morell et al, 1995). Some of the proteins involved in the
synthesis of starch can be recovered from the starch
granule (Denyer et al, 1995; Rahman et al, 1995).

Most wheat cultivars normally produce starch
containing 25% amylose and 75% amylopectin. Amylose is
composed of large linear chains of α(1-4) linked
α-D-glucopyranosyl residues, whereas amylopectin is a branching
form of α-glycan linked by α(1-6) linkages. The ratio of
amylose and amylopectin, the branch chain length and the






WO 99/14314 

PCT/AU98/00743 

number of branch chains of amylopectin are the major factors 
which determine the properties of wheat starch. 

Starch with various properties has been widely
used in industry, food science and medical science. High
amylose wheat can be used for plastic substitutes and in
paper manufacture to protect the environment; in health
foods to reduce bowel cancer and heart disease; and in
sports foods to improve athletes' performance. High
amylopectin wheat may be suitable for Japanese noodles, and
is used as a thickener in the food industry.

Wheat contains three sets of chromosomes (A, B and
D) in its very large genome of about 10^10 base pairs (bp).
The donor of the D genome to wheat is Triticum tauschii, and
by using a suitable accession of this species the genes from
the D genome can be studied separately (Lagudah et al,
1991).

There is comparatively little variation in starch
structure found in wheat varieties, because the hexaploid
nature of wheat prevents mutations from being readily
identified. Dramatic alterations in starch structure are
expected to require the combination of homozygous recessive
alleles from each of the 3 wheat genomes, A, B and D. This
requirement renders the probability of finding such mutants
in natural or mutagenised populations of wheat very low.
Variation in wheat starch is desirable in order to enable
better tailoring of wheat starches for processing and
end-user requirements.
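
The scale of this difficulty can be illustrated with a short calculation. The sketch below assumes, purely for illustration, that the homozygous null state arises independently on each of the A, B and D genomes; the function names and the example frequency of 0.01 are hypothetical and are not taken from the specification.

```python
def triple_null_probability(freq_a, freq_b, freq_d):
    """Chance that a single plant is homozygous null on all three genomes.

    Assumes the three events are independent, which is an illustrative
    simplification rather than a measured property of wheat populations.
    """
    for freq in (freq_a, freq_b, freq_d):
        if not 0.0 <= freq <= 1.0:
            raise ValueError("each frequency must lie between 0 and 1")
    return freq_a * freq_b * freq_d


def expected_plants_to_screen(probability):
    """Average number of plants screened before one such mutant is found."""
    if probability <= 0.0:
        raise ValueError("probability must be positive")
    return 1.0 / probability


# Hypothetical input: a 1% homozygous-null frequency on each genome.
p = triple_null_probability(0.01, 0.01, 0.01)
print(p, expected_plants_to_screen(p))
```

Under these assumed figures roughly one plant in a million would carry all three nulls, which is consistent with the statement that such mutants are very unlikely to be found by direct screening.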

Key commercial targets for the manipulation of
starch biosynthesis are:

1. "Waxy" wheats in which amylose content is
decreased to insignificant levels. This outcome is expected
to be obtained by eliminating granule-bound starch synthase
activity.

2. High amylose wheats, expected to be obtained
by suppressing starch branching enzyme-II activity.

3. Wheats which continue to synthesise starch
at elevated temperatures, expected to be obtained by
identifying or introducing a gene encoding a heat-stable
soluble starch synthase.

4. "Sugary types" of wheat which contain
increased amylose content and free sugars, expected to be
obtained by manipulating an isoamylase-type debranching
enzyme.

There are two general strategies which may be used
to obtain wheats with altered starch structure:

(a) using genetic engineering strategies to
suppress the activity of a specific gene, or to introduce a
novel gene into a wheat line; and

(b) selecting among existing variation in wheat for
missing ("null") or altered alleles of a gene in
each of the genomes of wheat, and combining
these by plant breeding.

However, in view of the complexity of the gene families,
particularly starch branching enzyme I (SBE I), without the
ability to target regions which are unique to genes
expressed in endosperm, modification of wheat by combination
of null alleles of several enzymes in general represents an
almost impossible task.

Branching enzymes are involved in the production
of glucose α-1,6 branches. Of the two main constituents of
starch, amylose is essentially linear, but amylopectin is
highly branched; thus branching enzymes are thought to be
directly involved in the synthesis of amylopectin but not
amylose. There are two types of branching enzymes in plants,
starch branching enzyme I (SBE I) and starch branching
enzyme II (SBE II), and both are about 85 kDa in size. At
the nucleic acid level there is about 65% sequence identity
between types I and II in the central portion of the
molecules; the sequence identity between SBE I from
different cereals is about 85% overall (Burton et al, 1995;
Morell et al, 1995).

In cereals, SBE I genes have so far been reported
only for rice (Kawasaki et al, 1991; Rahman et al, 1997). A
cDNA sequence for wheat SBE I is available on the GenBank
database (Accession No. Y12320; Repellin A., Nair R.B., Baga
M., and Chibbar R.N.: Plant Gene Register PGR97-094, 1997).
As far as we are aware, no promoter sequence for wheat SBE I
has been reported.

We have characterised an SBE I gene, designated 
wSBE I-D2, from Triticum tauschii, the donor of the D genome 
to wheat (Rahman et al, 1997). This gene encoded a protein 
sequence which had a deletion of approximately 65 amino 
acids at the C-terminal end, and appeared not to contain 
some of the conserved amino acid motifs characteristic of 
this class of enzyme (Svensson, 1994). Although wSBE I-D2 
was expressed as mRNA, no corresponding protein has yet been 
found in our analysis of SBE I isoforms from the endosperm 
and thus it is possible that this gene is a transcribed 
pseudogene.

Genes for SBE II are less well characterised; no 
genomic sequences are available, although SBE II cDNAs from 
rice (Mizuno et al, 1993; Accession No. D1S201) and maize 
(Fisher et al, 1993; Accession No. L08065) have been 
reported. In addition, a cDNA sequence for SBE II from
wheat is available on the GenBank database (Nair et al, 
1997; Accession No. Y11282); although the sequences are very 
similar to those reported herein, there are differences near 
the N-terminal of the protein, which specifies its 
intracellular location. No promoter sequences have been 
reported, as far as we are aware. 

Wheat granule-bound starch synthase (GBSS) is
responsible for amylose synthesis, while wheat branching
enzymes together with soluble starch synthases are
considered to be directly involved in amylopectin
biosynthesis. A number of isoforms of soluble and
granule-bound starch synthases have been identified in developing
wheat endosperm (Denyer et al, 1995). There are three
distinct isoforms of starch synthases, 60 kDa, 75-77 kDa and
100-105 kDa, which exist in the starch granules (Denyer et
al, 1995; Rahman et al, 1995). The 60 kDa GBSS is the
product of the wx gene. The 75-77 kDa protein is a wheat
soluble starch synthase I (SSSI) which is present in both
the soluble fraction and the starch granule-bound fraction
of the endosperm. However, the 100-105 kDa proteins, which
are another type of soluble starch synthase, are located
only in starch granules (Denyer et al, 1995; Rahman et al,
1995). To our knowledge there has been no report of any
complete wheat SSS I sequence, either at the protein or the
nucleotide level.

Both cDNA and genomic DNA encoding a soluble
starch synthase I of rice have been cloned and analysed
(Baba et al, 1993; Tanaka et al, 1995). The cDNAs encoding
potato soluble starch synthase SSSII and SSSIII and pea
soluble starch synthase SSSII have also been reported
(Edwards et al, 1995; Marshall et al, 1996; Dry et al,
1992). However, corresponding full length cDNA sequences for
wheat have hitherto not been available, although a partial
cDNA sequence (Accession No. U48227) has been released to
the GenBank database.

Approach (b) referred to above has been
demonstrated for the gene for granule-bound starch synthase.
Null alleles on chromosomes 7A, 7D and 4A were identified by
the analysis of GBSS protein bands by electrophoresis, and
combined by plant breeding to produce a wheat line
containing no GBSS, and no amylose (Nakamura et al, 1995).
Subsequently, PCR-based DNA markers have been identified,
which also identify null alleles for the GBSS loci on each
of the three wheat genomes. Despite the availability of a
considerable amount of information in the prior art, major
problems remain. Firstly, the presence of three separate
sets of chromosomes in wheat makes genetic analysis in this
species extraordinarily complex. This is further
complicated by the fact that a number of enzymes are
involved in starch synthesis, and each of these enzymes is
itself present in a number of forms, and in a number of
locations within the plant cell. Little, if any,
information has been available as to which specific form of
each enzyme is expressed in endosperm. For wheat, a limited
amount of nucleic acid sequence information is available,
but this is only cDNA sequence; no genomic sequence, and
consequently no information regarding promoters and other
control sequences, is available. Without being able to
demonstrate that the endosperm-specific gene within a family
has been isolated, such sequence information is of limited
practical usefulness.

SUMMARY OF THE INVENTION 

In this application we report the isolation and 
identification of novel genes from T. tauschii, the D-genome 
donor of wheat, that encode SBE I, SBE II, a 75 kDa SSS I, 
and an isoamylase-type debranching enzyme (DBE). Because of
the very close relationship between T. tauschii and wheat,
as discussed above, results obtained with T. tauschii can be
directly applied to wheat with little if any modification. 
Such modification as may be required represents routine 
trial and error experimentation. Sequences from these genes 
can be used as probes to identify null or altered alleles in 
wheat, which can then be used in plant breeding programmes 
to provide modifications of starch characteristics. The 
novel sequences of the invention can be used in genetic
engineering strategies, either to introduce a desired gene into a
host plant or to provide antisense sequences for suppression
of one or more specific genes in a host plant, in order to
modify the characteristics of starch produced by the plant.

By using T. tauschii, we have been able to examine 
a single genome, rather than three as in wheat, and to 
identify and isolate the forms of the starch synthesis genes 
which are expressed in endosperm. By addressing genomic 
sequences we have been able to isolate tissue-specific 
promoters for the relevant genes, which provides a mechanism
for simultaneous manipulation of a number of genes in the
endosperm. Because T. tauschii is so closely related to
wheat, results obtained with this model system are directly
applicable to wheat, and we have confirmed this
experimentally. The genomic sequences which we have
determined can also be used as probes for the identification
and isolation of corresponding sequences, including promoter 
sequences, from other cereal plant species. 

In its most general aspect, the invention provides
a nucleic acid sequence encoding an enzyme of the starch
biosynthetic pathway in a cereal plant, said enzyme being
selected from the group consisting of starch branching
enzyme I, starch branching enzyme II, soluble starch
synthase I, and debranching enzyme, with the proviso that
the enzyme is not soluble starch synthase I of rice, or
starch branching enzyme I of rice or maize.

Preferably the nucleic acid sequence is a DNA
sequence, and may be genomic DNA or cDNA. Preferably the
sequence is one which is functional in wheat. More
preferably the sequence is derived from a Triticum species,
most preferably Triticum tauschii.

Where the sequence encodes soluble starch
synthase, preferably the sequence encodes the 75 kDa soluble
starch synthase of wheat.

Biologically-active untranslated control sequences
of genomic DNA are also within the scope of the invention.
Thus the invention also provides the promoter of an enzyme
as defined above.

In a preferred embodiment of this aspect of the
invention, there is provided a nucleic acid construct
comprising a nucleic acid sequence of the invention, a
biologically-active fragment thereof, or a fragment thereof
encoding a biologically-active fragment of an enzyme as
defined above, operably linked to one or more nucleic acid
sequences facilitating expression of said enzyme in a plant,
preferably a cereal plant. The construct may be a plasmid
or a vector, preferably one suitable for use in the
transformation of a plant. A particularly suitable vector
is a bacterium of the genus Agrobacterium, preferably
Agrobacterium tumefaciens. Methods of transforming cereal
plants using Agrobacterium tumefaciens are known; see for
example Australian Patent No. 667939 by Japan Tobacco Inc.,
International Patent Application Number PCT/US97/10621 by
Monsanto Company and Tingay et al (1997).

In a second aspect, the invention provides a 
nucleic acid construct for targeting of a desired gene to 
endosperm of a cereal plant, and/or for modulating the time 
of expression of a desired gene in endosperm of a cereal 
plant, comprising one or more promoter sequences selected 
from SBE I promoter, SBE II promoter, SSS I promoter, and 
DBE promoter, operatively linked to a nucleic acid sequence 
encoding a desired protein, and optionally also operatively 
linked to one or more additional targeting sequences and/or 
one or more 3' untranslated sequences. 

The nucleic acid encoding the desired protein may
be in either the sense orientation or in the antisense
orientation. Preferably the desired protein is an enzyme of
the starch biosynthetic pathway. For example, the antisense
sequences of GBSS, starch debranching enzyme, SBE II, low
molecular weight glutenin, or grain softness protein I, may
be used. Preferred sequences for use in sense orientation
include those of bacterial isoamylase, bacterial glycogen
synthase, or wheat high molecular weight glutenin Bx17. It
is contemplated that any desired protein which is encoded by
a gene which is capable of being expressed in the endosperm
of a cereal plant is suitable for use in the invention.

In a third aspect, the invention provides a method
of modifying the characteristics of starch produced by a
plant, comprising the step of:

(a) introducing a gene encoding a desired enzyme
of the starch biosynthetic pathway into a host plant, and/or

(b) introducing an anti-sense nucleic acid
sequence directed to a gene encoding an enzyme of the starch
biosynthetic pathway into a host plant,

wherein said enzymes are as defined above.
Where both steps (a) and (b) are used, the enzymes
in the two steps are different.

Preferably the plant is a cereal plant, more
preferably wheat or barley.

As is well known in the art, anti-sense sequences
can be used to suppress expression of the protein to which
the anti-sense sequence is complementary. It will be
evident to the person skilled in the art that different
combinations of sense and anti-sense sequences may be chosen
so as to effect a variety of different modifications of the
characteristics of the starch produced by the plant.
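
The complementarity relied on here can be shown with a small sequence utility. The following is a minimal sketch, not part of the specification: it computes the reverse complement of a sense-strand DNA sequence, which is the sequence an anti-sense strand would have when read 5' to 3'. The function name and the example sequence are illustrative assumptions.

```python
# Translation table mapping each base to its Watson-Crick partner.
# N (unknown base) is mapped to itself.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(sense_dna):
    """Return the reverse complement of a DNA sequence, read 5' to 3'."""
    unexpected = set(sense_dna) - set("ACGTNacgtn")
    if unexpected:
        raise ValueError(f"unexpected characters: {sorted(unexpected)}")
    # Complement every base, then reverse so the result reads 5' to 3'.
    return sense_dna.translate(_COMPLEMENT)[::-1]


# Hypothetical six-base fragment, used only to show the behaviour.
print(reverse_complement("ATGGCC"))
```

Applying the function twice returns the original sequence, which is a convenient sanity check when preparing sense and anti-sense inserts from the same cDNA.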

In a fourth aspect, the invention provides a
method of targeting expression of a desired gene to the
endosperm of a cereal plant, comprising the step of
transforming the plant with a construct according to the
invention.

According to a fifth aspect, the invention
provides a method of modulating the time of expression of a
desired gene in endosperm of a cereal plant, comprising the
step of transforming the plant with a construct according to
the second aspect of the invention.

Where expression at an early stage following
anthesis is desired, the construct preferably comprises the
SBE II, SSS I or DBE promoters. Where expression at a later
stage following anthesis is desired, the construct
preferably comprises the SBE I promoter.

While the invention is described in detail in
relation to wheat, it will be clearly understood that it is
also applicable to other cereal plants of the family
Gramineae, such as maize, barley and rice.

Methods for transformation of monocotyledonous
plants such as wheat, maize, barley and rice and for
regeneration of plants from protoplasts or immature plant
embryos are well known in the art. See for example Lazzeri
et al, 1991; Jahne et al, 1991 and Wan and Lemaux, 1994 for
barley; Wirtzens et al, 1997; Tingay et al, 1997; Canadian
Patent Application No. 2092588 by Nehra; Australian Patent
Application No. 61781/94 by National Research Council of
Canada, Australian Patent No. 667939 by Japan Tobacco Co,
and International Patent Application Number PCT/US97/10621
by Monsanto Company.

The sequences of ADP glucose pyrophosphorylase
from barley (Australian Patent Application No. 65392/94),
starch debranching enzyme and its promoter from rice
(Japanese Patent Publication No. Kokai 6261787 and Japanese
Patent Publication No. Kokai 5317057), and starch
debranching enzyme from spinach and potato (Australian
Patent Application No. 44333/96) are all known.

Detailed Description of the Drawings

The invention will be described in detail by 
reference only to the following non-limiting examples and to 
the figures.

Figure 1 shows the hybridisation of genomic clones
isolated from T. tauschii.

DNA was extracted from the different clones,
digested with BamHI and hybridised with the 5' end of the
maize SBE I cDNA. Lanes 1, 2, 3 and 4 correspond to DNA
from clones λE1, λE2, λE6 and λE7 respectively. Note that
clones λE1 and λE2 give identical patterns, the SBE I gene
in λE6 is a truncated form of that in λE1, and λE7 gives a
clearly different pattern.

Figure 2 shows the hybridisation of DNA from
T. tauschii.

DNA from T. tauschii was digested with BamHI and
the hybridisation pattern compared with DNA from λE1 and λE7
digested with the same enzyme. Fragment E1.1 (see Figure 3)
from λE1 was used as the probe; it contains some sequences
that are over 80% identical to sequences in E7.8.
Approximately 25 µg of T. tauschii DNA was electrophoresed
in lane 1, and 200 pg each of λE1 and λE7 in lanes 2 and 3,
respectively.

Figure 3 shows the restriction maps of clones λE1
and λE7. The fragments obtained with EcoRI and BamHI are
indicated. The fragments sequenced from λE1 are E1.1, E1.2,
a part of E1.7 and a part of E1.5.

Figure 4 shows the comparison of deduced amino
acid sequence of wSBE I-D4 cDNA with the deduced amino acid
sequence of rice SBE I (RSBE I; Nakamura et al, 1992), maize
SBE I (MSBE I; Baba et al, 1991), wSBE I-D2 type cDNA (D2
cDNA; Rahman et al, 1997), pea SBE II (PESBE II, homologous
to maize SBE I; Burton et al, 1995), and potato SBE I
(POSBE; Cangiano et al, 1993). The deduced amino acid
sequence of the wSBE I-D4 cDNA is denoted by "D4cDNA".
Residues present in at least three of the sequences are
identified in the consensus sequence in capitals.

Figure 5 shows the intron-exon structure of
wSBE I-D4 compared to the corresponding structures of rice
SBE I (Kawasaki et al, 1993) and wSBE I-D2 (Rahman et al,
1997). The intron-exon structure of wSBE I-D4 is deduced by
comparison with the SBE I cDNA reported by Repellin et al
(1997).

The dark rectangles correspond to exons and the
light rectangles correspond to introns. The bars above the
structures indicate the percentage identity in sequence
between the indicated exons and introns of the relevant
genes. Note that intron 2 shares no significant sequence
identity and is not indicated.

Figure 6 shows the nucleotide sequence of part of
wSBE I-D4, the amino acid sequence deduced from this
nucleotide sequence, and the N-terminal amino acid sequence
of the SBE I purified from the wheat endosperm (Morell et
al, 1997).

Figure 7 shows the hybridisation of SBE I genomic
clones with the following probes:

A. wSBE I-D45 (derived from the 5' end of the
gene and including sequence from fragments E1.1 and E1.7),
and

B. wSBE I-D43 (derived from the 3' end of the
gene and containing sequences from fragment E1.5).

For panel A, tracks 1-13 correspond to clones λE1, λE2, λE6,
λE7, λE9, λE14, λE22, λE21, molecular weight markers, λE29,
λE30, λE31 and λE52. For panel B, tracks 1-12 correspond to
clones λE1, λE2, λE6, λE7, λE9, λE14, λE22, λE21, λE29,
λE30, λE31 and λE52. Note that clones λE7 and λE22 do not
hybridise to either of the probes and are wSBE I-D2 type
genes. Also note that clone λE30 contains a sequence
unrelated to SBE I. The size of the molecular weight
markers in kb is indicated. Clones λE7 and λE22 do
hybridise with a probe from E1.1, which is highly conserved
between wSBE I-D2 and wSBE I-D4.

Figure 8 shows the alignment of cDNA clones to
obtain the sequence represented by wSBE I-D4 cDNA. BED4 and
BED5 were obtained from screening the cDNA library with
maize BE I (Baba et al, 1991). BED1, 2 and 3 were obtained
by RT-PCR using defined primers.

Figure 9a shows the expression of Soluble Starch
Synthase I (SSS), Starch Branching Enzyme I (BE I) and
Starch Branching Enzyme II (BE II) mRNAs during endosperm
development.

RNA was purified from leaves, florets prior to
anthesis, and endosperm of wheat cultivar Rosella grown in a
glasshouse, collected 5 to 8 days after anthesis, 10 to 15
days after anthesis and 18 to 22 days after anthesis, and
from the endosperm of wheat cultivar Rosella grown in the
field and collected 12, 15 and 18 days after anthesis
respectively. Equivalent amounts of RNA were
electrophoresed in each lane. The probes were from the
coding region of the SM2 SSS I cDNA (from nucleotide 1615 to
1919 of the SM2 cDNA sequence); wSBE I-D43C (see Table 1),
which corresponds to the untranslated 3' end of wSBE I-D4
cDNA (E1 (3')); and the 5' region of SBE9 (SBE9 (5')),
corresponding to the region between nucleotides 743 to 1004
of GenBank sequence Y11282. No hybridisation to RNA
extracted from leaves or preanthesis florets was detected.

Figure 9b shows the hybridisation of RNA from the
endosperm of the hexaploid T. aestivum cultivar [...] with
the starch branching enzyme I gene. The probe, wSBE I-D43, is
defined in Table 1.

Figure 9c shows the hybridisation of RNA from the
endosperm of the hexaploid T. aestivum cultivar "Wyuna" with
the starch branching enzyme II gene. The probe, wSBE II-D13,
is defined in Table 2.

Figure 9d shows the hybridisation of RNA from the
endosperm of the hexaploid T. aestivum cultivar "Gabo" with
the SSS I gene. The probe spanned the region from
nucleotides 2025 to 2497 of the SM2 cDNA sequence shown in
SEQ ID No: 11.

Figure 9e shows the hybridisation of RNA from the
endosperm of the hexaploid T. aestivum cultivar "Gabo" with
the DBE I gene. The probe, a DBE3' 3' PCR fragment, extends
from nucleotide position 281 to 1072 of the cDNA sequence in
SEQ ID No: 16.

Figure 9f shows the hybridisation of RNA from the
endosperm of the hexaploid T. aestivum cultivar "Gabo" with
the wheat actin gene. The probe was a wheat actin DNA
sequence generated by PCR from wheat endosperm cDNA using
primers to conserved plant actin sequences.

Figure 9g shows the hybridisation of RNA from the
endosperm of the hexaploid T. aestivum cultivar "Gabo" with
a probe containing wheat ribosomal RNA 26S and 18S fragments
(plasmid pta250.2 from Dr Bryan Clarke, CSIRO Plant
Industry).

Figure 9h shows the hybridisation of RNA from the
hexaploid wheat cultivar "Gabo" with the DBE I probe
described in Figure 9e. Lane 1, leaf RNA; lane 2,
pre-anthesis floret RNA; lane 3, RNA from endosperm harvested 12
days after anthesis.

Figure 10 shows the comparison of wSBE I-D4
(sr427.res ck: 6,362, 1 to 11,099) and rice SBE I genomic
sequence (d10838.em_pl ck: 3,071, 1 to 11,700) (Kawasaki et
al, 1993; Accession Number D10838) using the programs
Compare and DotPlot (Devereux et al, 1984). The programs
used a window of 21 bases with a stringency of 14 to
register a dot.
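
The window and stringency settings can be made concrete with a short sketch. The code below is not the Compare/DotPlot implementation; it is a simplified illustration that assumes a dot is registered at window start positions (i, j) whenever at least `stringency` of the `window` aligned bases are identical. The function name and the toy sequences are illustrative assumptions.

```python
def dot_plot(seq_a, seq_b, window=21, stringency=14):
    """Return window start positions (i, j) that would register a dot.

    A dot is recorded when at least `stringency` of the `window` bases
    starting at seq_a[i] and seq_b[j] match position by position.
    """
    if stringency > window:
        raise ValueError("stringency cannot exceed the window size")
    seq_a, seq_b = seq_a.upper(), seq_b.upper()
    dots = []
    for i in range(len(seq_a) - window + 1):
        for j in range(len(seq_b) - window + 1):
            matches = sum(
                1 for k in range(window) if seq_a[i + k] == seq_b[j + k]
            )
            if matches >= stringency:
                dots.append((i, j))
    return dots


# Toy example with a small window so the result is easy to follow.
print(dot_plot("ACGT", "ACGT", window=2, stringency=2))
```

Comparing a sequence with itself yields a dot at every position on the main diagonal; regions of conservation between two different genes appear as diagonal runs, while unrelated regions stay mostly empty at a stringency of 14 in 21.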

Figure 11 shows the hybridisation of wheat DNA
from chromosome-engineered lines using the following probes:

A. wSBE I-D45 (from the 5' end of the gene),
B. wSBE I-D43 (from the 3' end of the gene),
C. wSBE I-D4R (repetitive sequence
approximately 600 bp 3' to the end of wSBE I-D4 sequence).

N7AT7B, no 7A chromosome, four copies of 7B
chromosome; N7BT7D, no 7B chromosome, four copies of 7D
chromosome; N7DT7A, no 7D chromosome, four copies of 7A
chromosome. The chromosomal origin of hybridising bands is
indicated.
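
The line names used for the chromosome-engineered stocks follow a regular pattern, which the following sketch decodes. It is an illustrative helper, not part of the specification: it assumes that each name consists of an N (nullisomic, no copies) part and a T (tetrasomic, four copies) part in either order, as in N7AT7B here and T2BN2A in Figure 15. The function name is hypothetical.

```python
import re

# One part of a line name: N or T followed by a chromosome such as 7A or 2D.
_PART = re.compile(r"([NT])(\d[ABD])")


def parse_line_name(name):
    """Decode a nullisomic-tetrasomic line name such as 'N7AT7B'."""
    cleaned = name.strip().upper()
    parts = _PART.findall(cleaned)
    rebuilt = "".join(kind + chrom for kind, chrom in parts)
    roles = dict(parts)
    if len(parts) != 2 or rebuilt != cleaned or set(roles) != {"N", "T"}:
        raise ValueError(f"not a recognised line name: {name!r}")
    return {"nullisomic": roles["N"], "tetrasomic": roles["T"]}


print(parse_line_name("N7AT7B"))
print(parse_line_name("T2BN2A"))
```

A band that is absent only in the line nullisomic for a given chromosome can then be assigned to that chromosome, which is the reasoning applied to the hybridisation patterns in these figures.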

Figure 12 shows the hybridisation of genomic
clones F1, F2, F3 and F4 with the entire SBE-9 sequence.
The DNA from the clones was purified and digested with
either BamHI or EcoRI, separated on agarose, blotted onto
nitrocellulose and hybridised with labelled SBE-9 (a SBE II
type cDNA). The pattern of hybridising bands is different
in the four isolates.

Figure 13a shows the N-terminal sequence of
purified SBE II from wheat endosperm as in Morell et al
(1997).

Figure 13b shows the deduced amino acid sequence
from part of wSBE II-D1 that encodes the N-terminal sequence
as described in Morell et al (1997).

Figure 14 shows the deduced exon-intron structure
for a part of wSBE II-D1. The scale is marked in bases.
The dark rectangles are exons.

Figure 15 shows the hybridisation of DNA from
chromosome engineered lines of wheat (cultivar Chinese
Spring) with a probe from nucleotides 550-850 from SBE-9.
The band of approximately 2.2 kb is missing in the line in
which chromosome 2D is absent.

T2BN2A: four copies of chromosome 2B, no copies
of chromosome 2A;

T2AN2B: four copies of chromosome 2A, no copies
of chromosome 2B;

T2AN2D: four copies of chromosome 2A, no copies
of chromosome 2D.

Figure 16 shows the N-terminal sequence of SSS I
protein isolated from starch granules (Rahman et al, 1995)
and deduced amino acid sequence of part of Sm2.

Figure 17 shows the hybridisation of genomic
clones sg1, 3, 4, 6 and 11 with the cDNA clone (sm2) for SSS
I. DNA was purified from indicated genomic clones, digested
with BamHI or SacI and hybridised to sm2. Note that the
hybridisation patterns for sg1, 3 and 4 are clearly
different from each other.

Figure 18 shows a comparison of the intron/exon
structures of the wheat and rice soluble starch synthase
genomic sequences. The dark rectangles indicate exons and
the light rectangles represent introns.

Figure 19 shows the hybridisation of DNA from
chromosome engineered lines of wheat (cultivar Chinese
Spring) digested with PvuII, with the sm2 probe.

N7AT7B: no 7A chromosome, four copies of 7B
chromosome;

N7BT7D: no 7B chromosome, four copies of 7D
chromosome;

N7DT7A: no 7D chromosome, four copies of 7A
chromosome.

A band is missing in the N7BT7A line.
Figure 20a shows the DNA sequence of a portion of
the wheat debranching enzyme (WDBE-I) PCR product. The
PCR product was generated from wheat genomic DNA (cultivar
Rosella) using primers based on sequences conserved in
debranching enzymes from maize and rice.

Figure 20b shows a comparison of the nucleotide
sequence of wheat debranching enzyme I (WDBE-I) PCR fragment
(WHEAT.DNA) with the maize Sugary-1 sequence (SUGARY.DNA).

Figure 20c shows a comparison between the
intron/exon structures of wheat debranching enzyme gene and
the maize sugary-1 debranching enzyme gene.

Figure 21a shows the results of Southern blotting
of T. tauschii DNA with wheat DBE-I PCR product. DNA from
T. tauschii was digested with BamHI, electrophoresed,
blotted and hybridised to the wheat DBE-I PCR product
described in Figure 20a. A band of approximately 2 kb
hybridised.

Figure 21b shows Chinese Spring nullisomic/tetrasomic
lines probed with probes from the DBE gene. Panel
(I) shows hybridisation with a fragment spanning the region
from nucleotide 270 to 465 of the cDNA sequence shown in SEQ
ID No: 16 from the central region of the DBE gene. Panel
(II) shows hybridisation with a probe from the 3' region of
the gene, from nucleotide 281 to 1072 of the cDNA sequence
given in SEQ ID No: 16.

Figures 22a to 22e show diagrammatic 
representations of the DNA vectors used for transient 
expression analysis. In each of the sequences the N-terminal 
methionine encoding ATG codon is shown in bold. 

Figure 22a shows a DNA construct pwsssIpro1gfpNOT
containing a 1042 base pair region of the wheat soluble
starch synthase I promoter (wSSSIpro1, from -1042 to -1, SEQ
ID No: 18) fused to the green fluorescent protein (GFP)
reporter gene.

Figure 22b shows a DNA construct pwsssIpro2gfpNOT
containing a 3914 base pair region of the wheat soluble
starch synthase I promoter (wSSSIpro2, from -3914 to -1, SEQ
ID No: 18) fused to the green fluorescent protein (GFP)
reporter gene.
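
The promoter coordinates quoted for these constructs can be related to positions in a sequence file with a small helper. The sketch below is illustrative only: it assumes the usual convention in which the A of the ATG start codon is +1 and the base immediately upstream is -1, with no position 0. The function name, parameter names and the toy sequence are hypothetical.

```python
def upstream_region(sequence, atg_index, start, end=-1):
    """Extract bases `start` to `end` (both negative, inclusive) upstream
    of the start codon.

    `atg_index` is the 0-based index of the A of the ATG in `sequence`.
    Position -1 is the base immediately before that A.
    """
    if not (start <= end <= -1):
        raise ValueError("expected start <= end <= -1")
    low = atg_index + start        # 0-based index of the first base kept
    high = atg_index + end + 1     # slice end (exclusive)
    if low < 0 or high > len(sequence):
        raise ValueError("requested region lies outside the sequence")
    return sequence[low:high]


# Toy sequence: twelve upstream bases followed by ATG and coding bases.
toy = "GGGTTTAAACCCATGAAA"
print(upstream_region(toy, toy.index("ATG"), -3))
```

With this convention a region described as "-1042 to -1" is 1042 bases long and a region described as "-3914 to -1" is 3914 bases long, matching the sizes stated for the two promoter fragments.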

Figure 22c shows a DNA construct psbeIIpro1gfpNOT
containing a 1203 base pair region of the wheat starch
branching enzyme II promoter (sbeIIpro1, from 1 to 1203 of SEQ
ID No: 10) fused to the green fluorescent protein (GFP)
reporter gene.

Figure 22d shows a DNA construct psbeIIpro2gfpNOT
containing a 1353 base pair region of the wheat starch
branching enzyme II promoter and transit peptide coding
region (sbeIIpro2, regions 1-1203, 1204 to 1336 and 1664 to
1680 of SEQ ID No: 10) fused to the green fluorescent protein
(GFP) reporter gene.

Figure 22e shows a DNA construct pact_jsgfp_nos
containing the plasmid backbone of pSP72 (Promega), the rice
ActI actin promoter (McElroy et al, 1991), the GFP gene
(Sheen et al, 1995) and the Agrobacterium tumefaciens
nopaline synthase (nos) terminator (Bevan et al, 1983).

Figure 23 shows T-DNA constructs for stable
transformation of rice by Agrobacterium. The backbone for
each plasmid is p35SH-iC (Wang et al, 1997). The various
promoter-GFP-Nos regions inserted are shown in (a), (b), (c)
and (d) respectively, and are described in detail in Example
24. Each of these constructs was inserted into the NotI
site of p35SH-iC using the NotI flanking sites at each end
of the promoter-GFP-Nos regions. The constructs were named
(a) p35SH-iC-BEIIpro1_GFP_Nos, (b) p35SH-iC-BEIIpro2_GFP_Nos,
(c) p35SH-iC-SSIpro1_GFP_Nos and (d) p35SH-iC-SSIpro2_GFP_Nos.

Figure 24 illustrates the design of 15 intron-spanning
BE II primer sets. Primers were based on the
wSBE II-D1 sequence (SEQ ID No: 10), and were designed such
that intron sequences in the wSBE II-D1 sequence (deduced
from Figure 13b and Nair et al, 1997; Accession No. Y11282)
were amplified by PCR.

Figure 25 shows the results of amplification using
the SBE II-Intron 5 primer set (primer set 6: sr913F and
WBE2E6R) on various diploid, tetraploid and hexaploid
wheats.

(i) T. boeoticum (A genome diploid)
(ii) T. tauschii (D genome diploid)
(iii) T. aestivum cv. Chinese Spring ditelosomic line
2AS (lacking chromosome arm 2AL)
(iv) Crete 10 (AABB tetraploid)
(v) T. aestivum cv. Rosella (hexaploid)

The horizontal axis indicates the size of the
product in base pairs, the vertical axis shows arbitrary
fluorescence units. The various arrows indicate the products
of different genomes: A, A genome; B, B genome; D, D genome;
U, unassigned additional product.
Figure 26 shows the results obtained by
amplification using the SBE II-Intron 10 primer set (primer
set 11: da5.seq and WBE2E11R) on the wheat lines:

(i) T. aestivum cv. Chinese Spring ditelosomic line
2AS.
(ii) T. aestivum Chinese Spring
nullisomic/tetrasomic line N2BT2A.
(iii) T. aestivum Chinese Spring
nullisomic/tetrasomic line N2DT2B.

The horizontal axis indicates the size of the
product in base pairs, the vertical axis shows arbitrary
fluorescence units. The various arrows indicate the products
of different genomes: A, A genome; B, B genome; D, D genome.

Figure 27 shows the results of transient
expression assays typical of each promoter and target
tissue. The photographs (40 x magnification) show
representative tissue from the transient expression assays,
viewed under a Leica microscope with blue light
illumination. Photographs were taken 48 to 72 hours after
tissue bombardment. The promoter constructs are listed as
follows (with the panels showing endosperm, embryo and leaf
expression listed in respective order): pact_jsgfp_nos
(panels a, g and m); pwsssIpro1gfpNOT (panels b, h and n);
pwsssIpro2gfpNOT (panels c, i and o); psbeIIpro1gfpNOT
(panels d, j and p); psbeIIpro2gfpNOT (panels e, k and q);
pZLgfpNOT (panels f, l and r).

Example 1 Identification of Gene Encoding SBE I 

Construction of Genomic Library and Isolation of Clones 

The genomic library used in this study was
constructed from Triticum tauschii, accession number
CPI 100799. Of all the accessions of
T. tauschii surveyed, the genome of CPI 100799 is the most
closely related to the D genome of hexaploid wheat.
Triticum tauschii, var strangulata (CPI accession
number 110799) was kindly provided by Dr E Lagudah. Leaves
were isolated from plants grown in the glasshouse.

DNA was extracted from leaves of Triticum tauschii
using published methods (Lagudah et al, 1991), partially
digested with Sau3A, size fractionated and ligated to the
arms of lambda GEM 12 (Promega). The ligated products were
used to transfect the methylation-tolerant strain PMC 103
(Doherty et al, 1992). A total of 2 x 10^6 primary plaques
were obtained with an average insert size of about 15 kb.
Thus the library contains approximately 6 genomes' worth of
T. tauschii DNA. The library was amplified and stored at
4°C until required.
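
The coverage figure quoted above follows from simple arithmetic. The sketch below is illustrative Python and not part of the original method. The plaque count and mean insert size are taken from the text; the haploid genome size is not stated here, so it is back-calculated from the quoted six-fold coverage and should be read as an assumption. All function names are hypothetical.

```python
# Rough library coverage arithmetic (illustrative only).
PRIMARY_PLAQUES = 2e6      # primary plaques, from the text
MEAN_INSERT_BP = 15_000    # average insert size of about 15 kb, from the text
QUOTED_COVERAGE = 6        # "approximately 6 genomes", from the text


def cloned_dna_bp(plaques, insert_bp):
    """Total amount of cloned DNA in the library, in base pairs."""
    return plaques * insert_bp


def implied_genome_size_bp(plaques, insert_bp, coverage):
    """Genome size implied by a stated coverage (an inference, not a measurement)."""
    return cloned_dna_bp(plaques, insert_bp) / coverage


def genome_equivalents(plaques, insert_bp, genome_bp):
    """Number of genome equivalents represented in the library."""
    return cloned_dna_bp(plaques, insert_bp) / genome_bp


total_bp = cloned_dna_bp(PRIMARY_PLAQUES, MEAN_INSERT_BP)
genome_bp = implied_genome_size_bp(PRIMARY_PLAQUES, MEAN_INSERT_BP, QUOTED_COVERAGE)
print(f"cloned DNA: {total_bp:.1e} bp, implied genome size: {genome_bp:.1e} bp")
```

On these numbers the library holds about 3 x 10^10 bp of cloned DNA, which corresponds to six genome equivalents only if the genome is taken to be roughly 5 x 10^9 bp.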

Positive plaques in the genomic library were
selected as those hybridising with the 5' end of a maize
starch branching enzyme I cDNA (Baba et al, 1991) using
moderately stringent conditions as described in Rahman et
al (1997).

Preparation of Total RNA from Wheat

Total RNA was isolated from leaves, pre-anthesis
pericarp and different developmental stages of wheat
endosperm of the cultivars Hartog and Rosella. This
material was collected from both the glasshouse and the
field. The method used for RNA isolation was essentially
the same as that described by Higgins et al (1976). RNA was
then quantified by UV absorption and by separation in
1.4% agarose-formaldehyde gels, which were then visualized
under UV light after staining with ethidium bromide
(Sambrook et al, 1989).

DNA and RNA analysis

DNA was isolated and analysed using established
protocols (Sambrook et al, 1989). DNA was extracted from
wheat (cv. Chinese Spring) using published methods (Lagudah
et al, 1991). Southern analysis was performed essentially
as described by Jolly et al (1996). Briefly, 20 µg wheat
DNA was digested, electrophoresed and transferred to a nylon
membrane. Hybridisation was conducted at 42°C in 25% or
50% formamide, 2 x SSC, 6% Dextran Sulphate for 16 h and the
membrane was washed at 60°C in 2 x SSC for 3 x 1 h unless
otherwise indicated. Hybridisation was detected by
autoradiography using Fuji X-Omat film.

RNA analysis was performed as follows. 10 ng of
total RNA was separated in a 1.4% agarose-formaldehyde gel
and transferred to a nylon Hybond N+ membrane (Sambrook et
al, 1989), and hybridized with cDNA probe at 42°C in
Khandjian hybridizing buffer (Khandjian, 1989). The 3' part
of wheat SBE I cDNA (designated wSBE I-D43, see Table 1) was
labelled with the Rapid Multiprime DNA Probe Labelling Kit
(Amersham) and used as probe. After washing at 60°C with
2 x SSC, 0.1% SDS three times, each time for about 1 to
2 hours, the membrane was visualized by overnight exposure
at -80°C with X-ray film, Kodak MR.

Example 2   Frequency of Recovery of SBE I Type Clones
            from the Genomic Library

An estimated 2 x 10^6 plaques from the amplified
library were screened using an EcoRI fragment that contained
1200 bp at the 5' end of maize SBE I (Baba et al, 1991) and
twelve independent isolates were recovered and purified.
This corresponds to the screening of somewhat fewer than the
2 x 10^6 primary plaques that exist in the original library
(each of which has an average insert size of 15 kb)
(Maniatis et al, 1982), because the amplification may lead
to the representation of some sequences more than others.
Assuming that the amplified library contains approximately
three genomes of T. tauschii, the frequency with which
SBE I-positive clones were recovered suggests the existence
of about 5 copies of SBE I type genes within the T. tauschii
genome.

Digestion of DNA from the twelve independent
isolates by the restriction endonuclease BamHI, followed by
hybridisation with a maize SBE I clone, suggested that the
genomic clones could be separated into two broad classes
(Figure 1). One class had 10 members and a representative
from this class is the clone λE1 (Figure 1, lane 1); λE6
(Figure 1, lane 3) is a member of this class, but is missing
the 5' end of the E1-SBE I gene because the SBE I gene is at
the extremity of the cloned DNA. Further hybridisation
studies at high stringency with the extreme 5' and
3' regions of the SBE I gene contained in λE1 suggested that
the other clones contained either identical or very closely
related genes.

The second family had two members, and of these
clone λE7 (Figure 1, lane 4) was arbitrarily selected for
further study. These two members did not hybridise to
probes from the extreme 5' and 3' regions of the SBE I gene
that were contained in λE1, indicating that they were a
distinct sub-class.

The DNA from T. tauschii and the lambda clones λE1
and λE7 was digested with BamHI and hybridised with
fragment E1.1, as shown in Figure 2. This fragment contains
sequences that are highly conserved (85% sequence identity
over 0.3 kb between λE1 and λE7), corresponding to exons 3,
4 and 5 of the rice gene. The bands in the genomic DNA at
0.8 kb and 1.0 kb correspond to identical sized fragments
from λE1 and λE7, as shown in Figure 2; these are
fragments E1.1 and E7.8 of the λE1 and λE7 genomic clones
respectively. Thus the arrangement of genes in the genomic
clones is unlikely to be an artefact of the cloning
procedure. There are also bands in the genomic DNA of
approximately 2.5 kb, 4.8 kb and 8 kb in size which are not
found from the digestion of λE1 or λE7; these could
represent genes such as the 5' sequences of wSBE I-D1 or
wSBE I-D3; see below.

Example 3   Tandem Arrangement of SBE I Type Genes in
            the T. tauschii Genome

Basic restriction endonuclease maps for λE1 and
λE7 are shown in Figure 3. The maps were constructed by
performing a series of hybridisations of EcoRI or BamHI
digested DNA from λE1 or λE7. The probes used were the
fragments generated from BamHI digestion of the relevant
clone. Confirmation of the maps was obtained by PCR
analysis, using primers both within the insert and also from
the arms of lambda itself. PCR was performed in a 10 µl
volume using reagents supplied by Perkin-Elmer. The primers
were used at a concentration of 20 µM. The program used was
94°C, 2 min, 1 cycle, then 94°C, 30 sec; 55°C, 30 sec; 72°C,
1 min for 36 cycles and then 72°C, 5 min; 25°C, 1 min.
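
The cycling program described above can be written down as data, which makes the total hold time easy to check. The sketch below is illustrative Python and not part of the original method. The class and function names are hypothetical; the step values are as read from the text (94°C for 2 min; 36 cycles of 94°C for 30 sec, 55°C for 30 sec and 72°C for 1 min; 72°C for 5 min; 25°C for 1 min), and ramp times between temperatures are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    temp_c: int    # block temperature in degrees Celsius
    seconds: int   # hold time at that temperature


@dataclass(frozen=True)
class Stage:
    steps: tuple   # Steps run in order within one cycle
    cycles: int = 1


# Program as read from the text; ramp times are not modelled.
PROGRAM = (
    Stage((Step(94, 120),)),                                        # initial denaturation
    Stage((Step(94, 30), Step(55, 30), Step(72, 60)), cycles=36),   # amplification
    Stage((Step(72, 300),)),                                        # final extension
    Stage((Step(25, 60),)),                                         # cool down
)


def total_hold_seconds(program):
    """Sum of all hold times, counting each stage once per cycle."""
    return sum(
        stage.cycles * sum(step.seconds for step in stage.steps)
        for stage in program
    )


print(total_hold_seconds(PROGRAM) / 60, "minutes of hold time")
```

On this reading the hold times add up to 80 minutes; the real run time on an instrument would be longer because of temperature ramping.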

Sequencing was performed on an ABI sequencer using
the manufacturer's recommended protocols for both dye primer
and dye terminator technologies. Deletions were carried out
using the Erase-a-base kit from Promega.

Sequence analysis was carried out using the GCG
version 7 package of computer programs (Devereaux et al,
1984).

The PCR products were also used as hybridisation
probes. The positioning of the genes was derived from
sequencing the ends of the BamHI subclones and also from
sequencing PCR products generated from primers based on the
insert and the lambda arms. The results indicate that there
is only a single copy of a SBE I type gene within λE1.
However, it is clear that λE7 resulted from the cloning of a
DNA fragment from within a tandem array of the SBE I type
genes. Of the three genes in the clone (named
wSBE I-D1, wSBE I-D2 and wSBE I-D3), only the central one
(wSBE I-D2) is complete.

Example 4   Construction and Screening of cDNA Library

A wheat cDNA library was constructed from the
cultivar Rosella using pooled RNA from endosperm at 8, 12,
15, 18 and 20 days after anthesis.

The cDNA library was prepared from poly(A)+ RNA
that was extracted from developing wheat grains (cv.
Rosella, a hexaploid soft wheat cultivar) at 8, 12, 15, 18,
21 and 30 days after anthesis. The RNA was pooled and used
to synthesise cDNA that was propagated in lambda ZapII
(Stratagene).

The library was screened with a genomic fragment
from λE7 encompassing exons 3, 4 and 5 (fragment E7.8 in
Figure 3). A number of clones were isolated. Of these, an
apparently full-length clone appeared to encode an unusual
type of cDNA for SBE I. This cDNA has been termed SBE I-D2
type cDNA. The putative protein product is compared with
the maize SBE I and rice SBE I type deduced amino acid
sequences in Figure 4. The main difference is that this
putative protein product is shorter at the C-terminal end,
with an estimated molecular size of approximately 74 kDa
compared with 85 kDa for rice SBE I (Kawasaki et al, 1993).
Note that amino acids corresponding to exon 9 of rice are
missing in SBE I-D2 type cDNA, but those corresponding to
exon 10 are present. There are no amino acid residues
corresponding to exons 11-14 of rice; furthermore, the
sequence corresponding to the last 57 amino acids of
SBE I-D2 type has no significant homology to the sequence of
the rice gene.

We expressed SBE I-D2 type cDNA in E. coli in
order to examine its function. The cDNA was expressed as a
fusion protein with 22 N-terminal residues of β-galactosidase
and two threonine residues followed by the SBE I-D2
cDNA sequence either in or out of frame. Although an
expected product of about 75 kDa in size was produced from
only the in-frame fusion, we could not detect any enzyme
activity from crude extracts of E. coli protein.
Furthermore the in-frame construct could not complement an
E. coli strain with a defined deletion in glycogen
branching, although other putative branching enzyme cDNAs
have been shown to be functional by this assay (data not
shown). It is therefore unclear whether the wSBE I-D2 gene
in λE7 codes for an active enzyme in vivo.
Example 5   Gene Structure in E7

i. Sequence of wSBE I-D2

We sequenced 9.2 kb of DNA that contained
wSBE I-D2. This corresponds to fragments 7.31, 7.8 and
7.18. Fragment 7.31 was sequenced in its entirety (4.1 kb),
but the sequence of about 30 bases about 2 kb upstream of
the start of the gene could not be obtained because it was
composed entirely of Gs. Elevation of the temperature of
sequencing did not overcome this problem. Fragments 7.8
(1 kb) and 7.18 (4 kb) were completely sequenced, and
corresponded to 2 kb downstream of the last exon detected
for this gene. It was clear that we had isolated a gene
which was closely related (approximately 95% sequence
identity) to the SBE I-D2 type cDNA referred to above,
except that the last 200 bp at the 3' end of the cDNA are
not present. The wSBE I-D2 gene includes sequences
corresponding to rice exon 11 which are not in the cDNA
clone. In addition it does not have exons 9, 12, 13 or 14;
these are also absent from the SBE I-D2 type cDNA. The
first two exons show lower identity to the corresponding
exons from rice (approximately 60%) (Kawasaki et al, 1993)
than do the other exons (about 80%). A diagrammatic
exon-intron structure of the wSBE I-D2 gene is indicated in
Figure 5. The restriction map was confirmed by sequencing
the PCR products that spanned fragments 7.18 and 7.8, and
7.8 and 7.31 (see Figure 3), respectively.

ii. Sequence of wSBE I-D3

This gene was not sequenced in detail, as the
genomic clone did not extend far enough to include the 5'
end of the sequence. The sequence is of a SBE I type. The
orientation of the gene is evident from the
relevant BamHI fragments, and was confirmed by sequence
analysis of a PCR product generated using primers from the
right arm of lambda and a primer from the middle of the
gene. The sequence homology with wSBE I-D2 is about 80% over
the regions examined. The 2 kb sequenced corresponded to
exons 5 and 6 of the rice gene; these sequences were
obtained by sequencing the ends of fragments 7.5, 7.4 and
7.14 respectively, although the sequences from the left end
of fragment 7.14 did not show any homology to the rice
sequences. The gene does not appear to share the 3' end of
SBE I-D2 type cDNA, as a probe from 500 bp at the 3' end of
the cDNA (including sequences corresponding to exons 8 and
10 from rice) did not hybridise to fragment 7.14, although
it hybridised to fragment 7.18.

iii. Sequence of wSBE I-D1

This gene was also not sequenced in detail, as it
was clear that the genomic clone did not extend far enough
to include the 5' sequences. Limited sequencing suggests
that it is also a SBE I type gene. The orientation relative
to the left arm of lambda was confirmed by sequencing a PCR
product that used a primer from the left arm of lambda and
one from the middle of the gene (as above). Its sequence
homology with wSBE I-D2, D3 and D4 (see below) is about 75%
in the region sequenced, corresponding to a part of exon 4 of
the rice gene.

Starch branching enzymes are members of the
α-amylase protein family, and in a recent survey Svensson
(1994) identified eight residues in this family that are
invariant, seven in the catalytic site and a glycine in a
short turn. Of the seven catalytic residues, four are
changed in SBE I-D2 type. However, additional variation in
the 'conserved' residues may come to light when more plant
cDNAs for branching enzyme I are available for analysis. In
addition, although exons 9, 11, 12, 13 and 14 from rice are
not present in the SBE I-D2 type cDNA, comparison of the
maize and rice SBE I sequences indicates that the 3' region
(from amino acid residue 730 of maize) is much more variable
than the 5' and central regions. The active sites of rice
and maize SBE I sequences, as indicated by Svensson (1994),
are encoded by sequences that are in the central portion of
the gene. When SBE II sequences from Arabidopsis were
compared by Fisher et al (1996) they also found variation at
the 3' and 5' ends. SBE I-D2 type cDNA may encode a novel
type of branching enzyme whose activity is not adequately
detected in the current assays for detecting branching
enzyme activity; alternatively the cDNA may correspond to an
endosperm mRNA that does not produce a functional protein.

Example 6   Cloning of the cDNA corresponding to the
            wSBE I-D4 gene

The first strand cDNAs were synthesized from 1 µg
of total RNA, derived from endosperm 12 days after
pollination, as described by Sambrook et al (1989), and then
used as templates to amplify two specific cDNA regions of
wheat SBE I by PCR.

Two pairs of primers were used to obtain the cDNA
clones BED1 and BED3 (Table 1). Primers used for cloning of
BED3 were the degenerate primer NTS5'

5' GGC NAG NGC NGA G/AGA C/TGG 3' (SEQ ID NO. 1),

based on the N-terminal sequence of the purified
wheat endosperm SBE I protein, in which the 5' end of the
primer is at position 168 of wSBE I-D4 cDNA, as shown in
Table 1, and the primer NTS3'

5' TAG ATT TCC TTG TCC ATCA 3' (SEQ ID NO. 2)

in which the 5' end is at position 1590 of
wSBE I-D4 cDNA (see Table 1), designed to anneal to the
conserved regions of the nucleotide sequences of BED5 and
the maize and rice SBE I cDNAs. For cloning of BED1, the
primers used were BEC5'

5' ATC ACG AGA GGT TGC TCA 3' (SEQ ID NO. 3)

in which the 5' end is at position 1 of wSBE I-D4
cDNA (see Table 1); the sequence was based on the wSBE I-D4
gene, and BEC3'

5' CGG TAC ACA GTT GCG TCA TTT TC 3' (SEQ ID NO. 4)

in which the 5' end is at position 334 of
wSBE I-D4 cDNA (see Table 1), and the sequence was based on
BED3.
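
The degenerate primer NTS5' stands for a pool of related sequences, because N can be any base and a slash marks a choice between two bases. The sketch below is illustrative Python and not part of the original method; the function names are hypothetical and the parsing convention (a slash joins alternatives at one position, as in G/A) is an assumption based on how the primers are written above. It counts and enumerates the sequences in the pool.

```python
import re
from itertools import product
from math import prod

# Bases allowed for each symbol used in the primers above.
ALLOWED = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "ACGT"}


def parse_primer(text):
    """Return one string of allowed bases per primer position."""
    compact = text.replace(" ", "").upper()
    tokens = re.findall(r"[ACGTN](?:/[ACGT])*", compact)
    return ["".join(ALLOWED[base] for base in token.split("/")) for token in tokens]


def degeneracy(positions):
    """Number of distinct sequences represented by the primer."""
    return prod(len(choices) for choices in positions)


def expand(positions):
    """Enumerate every concrete sequence in the primer pool."""
    return ["".join(combo) for combo in product(*positions)]


nts5 = parse_primer("GGC NAG NGC NGA G/AGA C/TGG")   # SEQ ID NO. 1 as written above
nts3 = parse_primer("TAG ATT TCC TTG TCC ATCA")      # SEQ ID NO. 2 as written above

print(len(nts5), degeneracy(nts5))   # primer length and pool size
print(len(nts3), degeneracy(nts3))
```

Three N positions and two two-way choices give a pool of 4 x 4 x 4 x 2 x 2 = 256 sequences for NTS5', whereas NTS3' is a single sequence.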


Example 7   Identification of the gene from the Triticum
            tauschii SBE I family which is expressed in
            the endosperm

We have isolated two classes of SBE I genomic
clones from T. tauschii. One class contained two genomic
clone isolates, and this class has been characterised in
some detail (Rahman et al, 1997). The complete gene
contained within this class of clones was termed wSBE I-D2;
there were additional genes at either end of the clone, and
these were designated wSBE I-D1 and wSBE I-D3. The other
class contained nine genomic clone isolates. Of these λE1
was arbitrarily taken as a representative clone, and its
restriction map is shown in Figure 3; the SBE I gene
contained in this clone was called wSBE I-D4.

Fragments E1.1 (0.8 kb) and E1.2 (2.1 kb) and
fragments E1.7 (4.8 kb) and E1.5 (3 kb) respectively were
completely sequenced. Fragment E1.7 was found to encode the
N-terminal of the SBE I which is found in the endosperm, as
described in Morell et al (1997). This is shown in
Figure 6. Using antibodies raised against the N-terminal
sequence, Morell et al (1997) found that the D genome
isoform was the most highly expressed in the cultivars
Rosella and Chinese Spring. We have thus isolated from
T. tauschii a gene, wSBE I-D4, whose homologue in the
hexaploid wheat genome encodes the major isoform for SBE I
that is found in the wheat endosperm.
Table 1

Location of structural features and probes within wSBE I-D4
sequence.

A. Positions of exons by comparison with the cDNA sequence
(Repellin et al, 1997). Accession number Y12320.

Exon number    Start posn    End posn
 1             4890          4987
 2             5082          5149
 3             5524          5731
 4             5819          5888
 5             6149          6318
 6             6519          7424
 7             7744          7860
 8             8015          8077
 9             8562          8670
10             9137          9237
11             9421          9488
12             9580          9661
13             9781          9897
14             9990          10480

B. Other features.

Name of feature                        wSBE I-D4       D4 cDNA
                                       sequence        sequence
Putative initiation of translation     4900            11
Mature N-terminal sequence of SBE I    5550            124
End of translated SBE I sequence       10225           2431
End of D4 cDNA sequence                10461           2687
wSBE I-D45                             4870, 5860      1, 354
wSBE I-D43                             10116, 10435    2338, 2657
E1.1                                   5680, 6400      380, 630
BED 1                                                  1, 354
BED 2                                                  169, 418
BED 3                                                  151, 1601
BED 4                                                  867, 2372
BED 5                                                  867, 2687
Endosperm box like motif TGAAAAGT      4480 590
CAAAT motif                            4863
TATAAA motif

All nine genomic clones of the λE1 type isolated
from T. tauschii appear to contain the wSBE I-D4 gene, or
very similar genes, on the basis of PCR amplification and
hybridisation experiments. However, the restriction
patterns obtained for the clones differ with BamHI and
EcoRI, among other enzymes, indicating that either the
clones represent near-identical but distinct genes or they
represent the same gene isolated in distinct products of the
Sau3A digest used to generate the library.

Example 8   Investigation of other SBE I genomic clones
            isolated

All ten members of the λE1-like class of SBE I
genomic clones were investigated by hybridisation with
probes derived from fragment E1.7 (sequence wSBE I-D45,
encoding the translation start signal and the first
100 amino acids from the N-terminal end and intron
sequences; see Table 1) and from fragment E1.5 (sequence
wSBE I-D43, corresponding largely to the 3' untranslated
sequence and containing intron sequences; see Table 1). The
results obtained were consistent with one type of gene being
isolated in different fragments in the different clones, as
shown in Figure 7. PCR products were obtained from the
clones λE1, 2, 9, 14, 27, 31 and 52 using primers that
amplify near the 5' end of the gene (positions 5590-6162 of
wSBE I-D4), and these hybridised to wSBE I-D45. Sequencing
showed no differences in sequence of a 200 bp product.

Analysis of the promoter for wSBE I-D4 allows us
to investigate the presence of motifs previously described
for promoters that regulate gene expression in the
endosperm. Forde et al (1985) compared prolamin promoters,
and suggested that the presence of a motif approximately
300 bp upstream of the transcription start point, called
the endosperm box, was responsible for endosperm-specific
expression. The endosperm box was subsequently considered
to consist of two different motifs: the endosperm motif (EM)
(canonical sequence TGTAAAG) and the GCN4 motif (canonical
sequence G/ATGAG/CTCAT). The GCN4 box is considered to
regulate expression according to nitrogen availability
(Muller and Knudsen, 1993). The wSBE I-D4 promoter contains
a number of imperfect EM-like motifs at approximately -100,
-300 and -400 as well as further upstream. However, no GCN4
motifs could be found, which lends support to the idea that
this motif regulates response to nitrogen, as starch
biosynthesis is not as directly dependent on the nitrogen
status of the plant as storage protein synthesis. Comparison
of the promoters for wSBE I-D4 and D2 (Rahman et al, 1997)
indicates that although there are no extensive sequence
homologies there is a region of about 100 bp immediately
before the first encoded methionine where the homology is
61% between the two promoters; in particular there is an
almost perfect match in the sequence over twenty base pairs,
CTCGTTGCTTCC/TACTCCACT (positions 4723-4742 of the wSBE I
sequence), but the significance of this is hard to gauge, as
it does not occur in the rice promoter for SBE I. The
availability of more promoters for starch biosynthetic
enzymes may allow firmer conclusions to be drawn. There are
putative CAAT and TATA motifs at positions 4870 and 4830
respectively of wSBE I-D4 sequence. The putative start of
translation of the mRNA is at position 4900 of wSBE I-D4.
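
A search for the EM and GCN4 motifs of the kind described above can be sketched as a pattern scan over both strands. The code below is illustrative Python and not the method actually used for the analysis. The function names are hypothetical, the slash notation is interpreted as a choice between two bases at one position, and TOY_SEQUENCE is an invented test string, not wSBE I-D4 promoter sequence.

```python
import re


def motif_to_regex(motif):
    """Convert slash notation such as 'G/ATGAG/CTCAT' into a regular expression."""
    tokens = re.findall(r"[ACGT](?:/[ACGT])*", motif.upper())
    return "".join(
        token if len(token) == 1 else "[" + token.replace("/", "") + "]"
        for token in tokens
    )


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_motif(seq, motif):
    """Return sorted (position, strand) hits; positions are 1-based on the
    given strand and mark the leftmost base covered by the match."""
    seq = seq.upper()
    # A lookahead is used so that overlapping matches are all reported.
    pattern = re.compile("(?=(" + motif_to_regex(motif) + "))")
    hits = [(m.start() + 1, "+") for m in pattern.finditer(seq)]
    for m in pattern.finditer(reverse_complement(seq)):
        width = len(m.group(1))
        hits.append((len(seq) - (m.start() + width) + 1, "-"))
    return sorted(hits)


# Invented sequence for demonstration only.
TOY_SEQUENCE = "CCTGTAAAGCC" + "ATGAGTCAT" + "GGCTTTACAGG"

print(find_motif(TOY_SEQUENCE, "TGTAAAG"))         # EM motif
print(find_motif(TOY_SEQUENCE, "G/ATGAG/CTCAT"))   # GCN4 motif
```

An exact-match scan like this would miss the imperfect EM-like motifs mentioned above; finding those would need a mismatch-tolerant search, which is not shown here.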

Figure 5 shows the structure of the wSBE I-D4
gene, compared with the genes from rice and wheat (Kawasaki
et al, 1993; Rahman et al, 1997). The rice SBE I has 14
exons compared with 13 for wSBE I-D4 and 10 for wSBE I-D2.
There is good conservation of exon-intron structure between
the three genes, except at the extreme 5' end. In particular
the sizes of intron 1 and intron 2 are very different
between rice SBE I and wSBE I-D4.
Example 9   Isolation of cDNA for SBE I

Using the maize starch branching enzyme I cDNA as
a probe (Baba et al, 1991), 10 positive plaques were
recovered by screening approximately 10^5 plaques from a
wheat endosperm cDNA library prepared from the cultivar
Rosella, as described in Example 4. On purifying and
sequencing these plaques it was clear that even the longest
clone (BED5, 1822 bp) did not encode the N-terminal sequence
obtained from protein analysis. Degenerate primers based on
the wheat endosperm SBE I protein N-terminal sequence
(Morell et al, 1997) and the sequence from BED5 were then
used to amplify the 5' region; this produced a cDNA clone
termed BED3 (Table 1 and Figure 8). This cDNA clone
overlapped extensively and had 100% sequence identity with
BED5 and BED4 (Figure 8). As almost the entire protein
N-terminal sequence had been included in the primer sequence
design, this did not provide independent evidence of the
selection of a cDNA sequence in the endosperm that encoded
the protein sequence of the main form of SBE I. Using
BED3 to screen a second cDNA library produced BED2, which is
shorter than BED3 but confirmed the BED3 sequence at 100%
identity between positions 169 and 418 (Figure 8 and
Table 1). In addition the entire cDNA sequence for BED3
could be detected at a 100% match in the genomic clone λE1.
Primers based on the putative transcription start point
combined with a primer based on the incomplete cDNAs
recovered were then used to obtain a PCR product from total
endosperm RNA by reverse transcription. This led to the
isolation of the cDNA clone BED1, of 300 bp, whose location
is shown in Figure 8. By analysing this product, a sequence
was again obtained that could be found exactly in the
genomic clone λE1, and which overlapped precisely with BED3.

The N-terminal of the protein matches that of
SBE I isolated from wheat endosperm by Morell et al (1997),
and thus the wSBE cDNA represents the gene for the
predominant SBE I isoform expressed in the endosperm. The
encoded protein is 87 kDa; this is similar to proteins
encoded by maize (Baba et al, 1991) and rice (Nakamura et
al, 1992) cDNAs for SBE I and is distinct from the wSBE I-D2
cDNA described previously, in which the encoded protein was
74 kDa (Rahman et al, 1997).
Five cDNA clones were sequenced and their
sequences were assembled into one contiguous sequence using
a GCG program (Devereaux et al, 1984). The arrangement of
these sequences is illustrated in Figure 8, the nucleotide
sequence is shown in SEQ ID No: 5, and the deduced amino acid
sequence is shown in SEQ ID No: 6. The intact cDNA sequence,
wSBE I-D4 cDNA, is 2687 bp and contains one large open
reading frame (ORF), which starts at nucleotides 11 to 13
and ends at nucleotides 2432 to 2434. It encodes a
polypeptide of 807 amino acids with a molecular weight of
87 kDa. Comparison of the amino acid sequence encoded by
wSBE I-D4 cDNA with that encoded by maize and rice SBE I
cDNAs showed that there is 75-80% identity between any
two of these sequences at the nucleotide level and almost 90%
at the amino acid level. Alignment of these three
polypeptide sequences, as shown in Figure 4, along with the
deduced sequences for pea, potato and wSBE I-D2 type cDNA,
indicated that the sequences in the central region are
highly conserved, and sequences at the 5' end (about
80 amino acids) and the 3' end (about 60 amino acids) are
variable.
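
The ORF coordinates and the polypeptide length quoted above can be cross-checked with a few lines of arithmetic. The sketch below is illustrative Python with hypothetical names; it assumes that nucleotides 11 to 13 are the initiation codon and nucleotides 2432 to 2434 are the stop codon, which is how the coordinates are read here.

```python
ORF_FIRST_BASE = 11           # first base of the initiation codon, from the text
ORF_LAST_BASE = 2434          # last base of the stop codon, from the text
QUOTED_MASS_DA = 87_000       # 87 kDa, from the text


def codon_count(first_base, last_base):
    """Number of codons between two 1-based, inclusive coordinates."""
    length = last_base - first_base + 1
    if length % 3 != 0:
        raise ValueError("ORF length is not a multiple of three")
    return length // 3


codons = codon_count(ORF_FIRST_BASE, ORF_LAST_BASE)
residues = codons - 1                      # the stop codon encodes no residue
mean_residue_mass = QUOTED_MASS_DA / residues

print(codons, residues, round(mean_residue_mass, 1))
```

The coordinates span 2424 bases, or 808 codons, which gives 807 residues once the stop codon is excluded, in agreement with the stated polypeptide length. The implied mean residue mass of about 108 Da is only a plausibility check on the 87 kDa figure, not a calculation of molecular weight from the sequence.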

Svensson et al (1994) indicated that there were
several invariant residues in sequences of the α-amylase
super-family of proteins to which SBE I belongs. In the
sequence of maize SBE I these are in motifs commencing at
amino acid residue positions 341, 415, 472 and 537
respectively; these are also encoded in the wSBE I-D4
sequence (SEQ ID No: 9), further supporting the view that
this gene encodes a functional enzyme. This is in contrast
to the results with the wSBE I-D2 gene, where three of the
conserved motifs appear not to be encoded (Rahman et al,
1997).

There is about 90% sequence identity in the
deduced amino acid sequence between wSBE I-D4 cDNA and rice
SBE I cDNA in the central portion of the molecule (between
residues 160 and 740 for the deduced amino acid product from
wSBE I-D4 cDNA). The sequence identity of the deduced amino
acid sequence of the wSBE I-D4 cDNA to the deduced amino
acid sequence of wSBE I-D2 is somewhat lower (85% for the
most conserved region, between residues 285 and 390 for the
deduced product of wSBE I-D4 cDNA). Surprisingly, however,
wSBE I-D4 cDNA is missing the sequence that encodes amino
acids at positions 30 to 58 in rice SBE I (see Figure 4).
This corresponds to residues within the transit peptide of
rice SBE I. A corresponding sequence also occurs in the
deduced amino acid sequence from maize SBE I (Baba et al,
1991) and wSBE I-D2 type cDNA (Rahman et al, 1997).
Consequently the transit sequence encoded by wSBE I-D4 cDNA
is unusually short, containing only 38 amino acids, compared
with 55-60 amino acids deduced for most starch biosynthetic
enzymes in cereals (see for example Ainsworth, 1993; Nair et
al, 1997). The wSBE I-D4 gene does contain this sequence,
but this does not appear to be transcribed into the major
species of RNA from this gene, although it can be detected
at low relative abundance. This raises the possibility of
alternative splicing of the wSBE I-D4 transcript, and also
the question of the relative efficiency of
translation/transport of the two isoforms. The possibility
of alternative splicing in both rice and wheat has been
considered for soluble starch synthase (Baba et al, 1993;
Rahman et al, 1995). Alternative splicing of soluble starch
synthase would give a transit sequence of 40 amino acids,
which is the same length proposed for the product of
wSBE I-D4 cDNA.

We have previously used probes based on exons 4, 5
and 6 (E7.8 and E1.1, see Rahman et al., 1997) of wSBE I-D2
to probe wheat and T. tauschii genomic DNA cleaved with PvuII
and BamHI respectively. This region is highly conserved
within rice SBE I, wSBE I-D2 and wSBE I-D4, and produced ten
bands with wheat DNA and five with T. tauschii DNA. Neither
PvuII nor BamHI cleaved within the probe sequences,
suggesting that each band represented a single type of SBE I
gene. We have described four SBE I genes from T. tauschii:
wSBE I-D1, wSBE I-D2, wSBE I-D3 and wSBE I-D4 (Rahman et al,
1997 and this specification), and so we may have accounted
for most of the genes in T. tauschii and, by extension, the
genes from the D genome of wheat. In wheat, at least two
hybridising bands could be assigned to each of
chromosomes 7A, 7B and 7D.






Example 10 Tissue specificity and expression during
           endosperm development

The 300 bp of 3' untranslated sequence of
wSBE I-D4 cDNA does not show any homology with either the
wSBE I-D2 type cDNA that we have described earlier (Rahman
et al, 1997) or with BE-I from rice, as shown in Figure 5.
We have called this sequence wSBE I-D43C (see SEQ ID No: 9).
It seemed likely that wSBE I-D43C would be a specific probe
for this class of SBE-I, and thus it was used to investigate
the tissue specificity. Hybridization of RNA from endosperm
of hexaploid T. tauschii cultures with probes for SBE I,
SBE II, SSS I, DBE I, wheat actin, and wheat ribosomal RNA
was examined. RNA was purified at various numbers of days
after anthesis from plants grown with a 16 h photoperiod at
13 °C (night) and 18 °C (day). The age of the endosperms
from which RNA was extracted, in days after anthesis, is
given above the lanes in the blot. Equivalent amounts of RNA
were electrophoresed in each lane. The probes used are
identified in Tables 1 and 2.

The results are shown in Figures 9a to 9g. An RNA
species of about 2700 bases in size was found to hybridise.
This is very close to the size of the wSBE I-D4 cDNA
sequence. RNA hybridising to wSBE I-D43C is most abundant
at the mid-stage of endosperm development, as shown in
Figure 9a, and in field grown material is relatively
[text illegible in source] time at which
there is rapid starch and storage protein accumulation
(Morell et al, 1995).

The sequence contained within the wSBE I-D4 gene
appears to be expressed only in the endosperm (Figure 9a,
Figure 9b). We could not detect any expression in the leaf.
This could be because another isoform is expressed in the
leaf, and/or because the amount of SBE I present in the leaf
is much less than what is required in the endosperm.
Isolation of SBE I clones from a leaf cDNA library would
enable this question to be resolved.

Example 11 Intron-Exon Structure of SBE I 

By comparison of the cDNA sequence of SBE I
(Repellin et al, 1997) with that of wSBE I-D4 we can deduce
the intron-exon structure of the gene for the major isoform
of SBE I that is found in the endosperm. The structure
contains 14 exons, the same number as in rice (Kawasaki et
al, 1993). These 14 exons are spread over 6 kb of sequence, a
distance similar to that found in both rice SBE I and
wSBE I-D2. A dotplot comparison of the wSBE I-D4 sequence
and that of rice SBE I, depicted in Figure 10, shows
good sequence identity over almost the entire gene starting
from about position 5100 of wSBE I-D4; the identity is poor
over the first 5 kb of sequence, corresponding largely to
the promoter sequences. The sequence identity over introns
(about 60%) is lower than over exons (about 85%).
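
The intron and exon identities quoted above are percentages of matching positions in pairwise alignments. The following Python sketch shows one common way such a figure can be computed. The function name, the convention of skipping gap columns, and the two short sequence pairs are illustrative assumptions; they are not taken from the wSBE I-D4 or rice SBE I sequences, and other identity definitions (for example counting gaps as mismatches) would give different values.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percentage of ungapped alignment columns in which the residues match."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # assumption: gap columns are excluded from the count
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Hypothetical aligned fragments, for illustration only.
exon_like = percent_identity("ATGGCTAAGC", "ATGGCAAAGC")
intron_like = percent_identity("GTAAG-TTCA", "GTCAGATACA")
print(f"exon-like: {exon_like:.1f}%  intron-like: {intron_like:.1f}%")
```

Applied separately to the aligned exon and intron segments of two genes, a calculation of this kind yields per-region figures comparable in form to those reported above.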

Example 12 Repeated Sequences in SBE I 

Sequencing of wSBE I-D4 revealed there was a
repeated sequence of at least 300 bp contained in a 2 kb
fragment about 600 bp after the 3' end of the gene. We have
called this sequence wSBE I-D4R (SEQ ID NO: 9). This
repeated sequence is within fragment E1.5 (Figure 3 and
Table 1) and is flanked by non-repetitive sequences from the
genomic clone. We have previously shown that the
restriction pattern obtained by digesting λE1 with the
restriction enzyme BamHI is also obtained when T. tauschii
DNA is digested. Thus wSBE I-D4R is unlikely to be a
cloning artefact. A search of the GenBank Database revealed
that wSBE I-D4R shared no significant homology with any
sequence in the database. Hybridisation experiments with
wSBE I-D4R showed that all of the other SBE I-D4 type
genomic clones (except number 29) contained this repeated
sequence (data not shown). The wSBE I-D4R sequence was not
highly repeated and occurred in the wheat genome with a
similar frequency to the wSBE I-D4 sequence.
When wSBE I-D4R was used as the probe on wheat DNA
from the nulli-tetra lines, four bands were obtained; two of
these bands could be assigned to chromosome 7A and the
others to chromosomes 7B and 7D (Figure 11). One of the two
BamHI fragments from wheat DNA which could be assigned to
chromosome 7A was distinct from the single band from
chromosome 7A detected using wSBE I-D43 as the probe; the
other three bands coincided in the autoradiograph with bands
obtained with wSBE I-D43, and are likely to represent the
same fragment. However, one of these fragments was distinct
from the BamHI fragment that hybridised to the wSBE I-D43
sequence. In wSBE I-D4 (see SEQ ID No: 9), the wSBE I-D43
sequence is only 300 bp upstream of wSBE I-D4R, and occurs
in the same BamHI fragment. These results suggest that the
wSBE I-D4R sequence can occur independently of wSBE I-D4 in
the wheat genome.



Example 13 Isolation of Genomic Clones Encoding SBE II 

Screening of a cDNA library, prepared from the
wheat endosperm as described in Example 4, with the maize
BE I clone (Baba et al, 1991) at low stringency led to the
isolation of two classes of positive plaques. One class was
strongly hybridising, and led to the isolation of wheat
SBE I-D2 type and SBE I-D4 type cDNA clones, as described in
Example 5 and in Rahman et al (1997). The second class was
weakly hybridising, and one member of this class was
purified. This weakly hybridising clone was termed SBE-9,
and on sequencing was found to contain a sequence
distinct from that for SBE I. This sequence showed greatest
homology to maize BE II sequences, and was considered to
encode part of the wheat SBE II sequence.

The screening of approximately 5 x 10^5 plaques
from a genomic library constructed from T. tauschii (see
Example 1) with the SBE-9 sequence led to the isolation of
four plaques that were positive. These were designated
wSBE II-D1 to wSBE II-D4 respectively, and were purified and
analysed by restriction mapping. Although they all had
different hybridization patterns with SBE-9, as shown in
Figure 12, the results were consistent with the isolation of
the same gene in different-sized fragments.

Example 14 Identification of the N-terminal sequence of
           SBE II

Sequencing of the SBE II gene contained in
clone 2, termed wSBE II-D1 (see SEQ ID No: 10), showed that
it coded for the N-terminal sequence of the major isoform of
SBE II expressed in the wheat endosperm, as identified by
Morell et al (1997). This is shown in Figure 13.

Example 15 Intron-Exon Structure of the SBE II Gene 

In addition to encoding the N-terminal sequence of
SBE II, as shown in Example 14, part of the wSBE II-D1
sequence was found to have 100% sequence identity with the
cDNA sequence reported by Nair et al (1997). Thus the
intron-exon structure can be deduced, and this is shown in
Figure 14. The positions of exons and other major structural
features of the SBE II gene are summarized in Table 2.
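
Once exon boundaries have been assigned by aligning a cDNA to the genomic sequence, intron positions follow directly from the gaps between consecutive exons. The Python sketch below illustrates this using the first five exon coordinates listed in Table 2. It assumes that the tabulated coordinates are 1-based and inclusive, which the specification does not state explicitly; the function names are illustrative only.

```python
# Exon (genomic start, genomic finish) pairs for exons 1 to 5 of wSBE II-D1,
# copied from Table 2, part A. Assumed to be 1-based, inclusive coordinates.
EXONS = [(1058, 1336), (1664, 1761), (2038, 2279), (2681, 2779), (2949, 2997)]


def exon_lengths(exons):
    """Length in bases of each exon."""
    return [finish - start + 1 for start, finish in exons]


def introns_between(exons):
    """(start, finish, length) of the intron lying between each exon pair."""
    introns = []
    for (_, prev_finish), (next_start, _) in zip(exons, exons[1:]):
        start, finish = prev_finish + 1, next_start - 1
        introns.append((start, finish, finish - start + 1))
    return introns


for number, (start, finish, length) in enumerate(introns_between(EXONS), 1):
    print(f"intron {number}: {start}-{finish} ({length} bp)")
```

Under these assumptions the first intron would span positions 1337 to 1663 (327 bp).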

Example 16 Number of SBE II Genes in T. tauschii and
           Wheat

Hybridisation of the SBE II conserved region with
T. tauschii DNA revealed the presence of three gene classes.
However, in our screening we only recovered one class.
Hybridisation to wheat DNA indicated that the locus for
SBE II was on chromosome 2, with approximately 5 loci in
wheat; most of these appear to be on chromosome 2D, as shown
in Figure 15.

Table 2

Positions of structural features in wSBE II-D1

A. Positions of exons.

Exon number    Genomic start    Genomic finish

     1              1058             1336
     2              1664             1761
     3              2038             2279
     4              2681             2779
     5              2949             2997
     6              3145             3204
     7              3540             3620
     8              3704             3825
     9              4110             4188
    10              4818             4939
    11              5115             5234
    12              6209             6338
    13              6427             6549
    14              6739             6867
    15              7447             7550
    16              8392             8536
    17              9556             9703
    18              9839             9943
    19             10120            10193
    20             10395            10550
    21             10928            11002
    22             11092            11475

B. Other structural features within the wSBE II-D1 DNA
   sequence

Feature                                     Position

Putative initiation of translation          1214
Mature N-terminal sequence of SBE II        1681
wSBE II-D13                                 11116 to 11448
Endosperm box like motif TGAAAAGT           521
Endosperm box like motif TGAAAGT            565
Endosperm box like motif CGAAAAT            669
Endosperm box like motif TAAATGT            768
CAAAAT motif                                784
TCAATT motif                                1108
TATAAA motif                                799
AATTAA motif                                1110

Example 17 Expression of SBE II

Investigation of the pattern of expression of
SBE II revealed that the gene was only expressed in the
endosperm. However the timing of expression was quite
distinct from that of SBE I, as illustrated in
Figures 9a, 9b and 9c.

SBE I gene expression is only clearly detectable
from the mid-stage of endosperm development (10 days after
anthesis in Figure 9b), whereas SBE II gene expression is
clearly seen much earlier, in endosperm tissue at 5-8 days
after anthesis (Figures 9a and 9c), corresponding to an
early stage of endosperm development. The hybridisation of
wheat endosperm mRNA with the actin and ribosomal RNA genes
is shown as controls (Figures 9f and 9g, respectively).

Example 18 Cloning of Wheat Soluble Starch Synthase
           cDNA

Primers for amplification of SSS I were designed from
a conserved sequence region, identified by
comparison of the nucleotide sequences encoding soluble
starch synthases of rice and pea. A 300 bp RT-PCR product
was obtained by amplification of cDNA from wheat endosperm
at 12 days post anthesis. The 300 bp RT-PCR product was
then cloned, and its sequence analysed. Comparison of
its sequence with rice SSS cDNA showed about 80% sequence
homology. The 300 bp RT-PCR product was 100% homologous to
the partial sequence of a wheat SSS I in the database
produced by Block et al (1997).

The 300 bp cDNA fragment of wheat soluble starch
synthase thus isolated was used as a probe for the screening
of a wheat endosperm cDNA library (Rahman et al, 1997).
Eight cDNA clones were selected. One of the largest cDNA
clones (sm2) was used for DNA sequencing analysis, and gave
a 2662 bp nucleotide sequence, which is shown in SEQ ID
NO: 14. A large open reading frame of this cDNA encoded a
647 amino acid polypeptide, starting at nucleotides 247 to
250 and terminating at nucleotides 2198 to 2200. The
deduced polypeptide was shown by protein sequence analysis
to contain the N-terminal sequence of a 75 kDa granule-bound
protein (Rahman et al, 1995). This is illustrated in
Figure 16. The location of the 75 kDa protein was
determined for both the soluble fraction and starch
granule-bound fraction by the method of Denyer et al (1995).
Thus this cDNA clone encoded a polypeptide comprising a 41
amino acid transit peptide and a 606 amino acid mature
peptide (SEQ ID NO: 12). The cleavage site LRRL was located
at amino acids 36 to 39 of the transit peptide of this
deduced polypeptide.

Comparison of wheat SSS I with rice SSS and potato
SSS showed that there is 87.4% and 75.9% homology
respectively at the amino acid level, and 74.7% and 58.1%
homology respectively at the nucleotide level. Some amino
acids in the N-terminal sequences of the SSS I of wheat and
rice were conserved. Major features of the SSS I gene are
summarized in Table 3.

Example 19 Isolation of Genomic Clone of Wheat Soluble
           Starch Synthase

Seven genomic clones were obtained with a 300 bp
cDNA probe by screening approximately 5 x 10^5 plaques from a
genomic DNA library of Triticum tauschii, as described
above. DNA was purified from 5 of these clones and digested
with BamHI and SacI. Southern hybridization analysis using
the 300 bp cDNA as probe showed that these clones could be
classified into two classes, as shown in Figure 17. One
genomic clone, sg3, contained a long insert, and was
digested with BamHI or SacI and subcloned into pBluescript
KS+ vector.

Table 3

Comparison of exons and introns of soluble starch synthase I
genes of wheat and rice

(1) Identity of exons of soluble starch synthase I genes of
    wheat and rice

Exons   wSSI-D1   rSSI   identity (%)   start site   stop site
                                        (wSSI-D1)    (wSSI-D1)

1a        255      113      57.52          -253           0
1b        316      298      58.92             1         316
2         356      356      82.87          1473        1828
3          78       78      92.31          2746        2823
4         125      125      90.40          2906        3028
5          82       82      89.02          4113        4194
6         174      174      93.10          4286        4459
7          82       82      93.90          4562        4643
8          92       92      92.39          4743        4835
9          63       63      90.48          4959        5021
10         90       90      82.22          5103        5192
11        125      125      88.80          8594        8718
12        109      109      91.74          8807        8915
13         53       53      81.13          8992        9044
14         40       41      80.00          9160        9199
15a       159      113      79.65          9499        9657
15b       392      539      46.46          9658       10098

(2) Identity of introns of soluble starch synthase I genes
    of wheat and rice

Introns wSSI-D1   rSSI   identity (%)   start site   stop site
                                        (wSSI-D1)    (wSSI-D1)

1        1156      907      41.05           317        1472
2         917      851      41.65          1829        2745
3          82       87      45.12          2824        2905
4        1084      835      48.50          3029        4112
5          91       96      57.78          4195        4285
6         102      189      52.48          4460        4561
7          99       96      52.08          4644        4742
8         123      110      45.46          4836        4958
9          81       78      58.97          5022        5102
10       3401      663      37.56          5193        8593
11         88      124      56.82          8719        8806
12         76       81      48.68          8916        8991
13        115      135      45.22          9045        9159
14        299      830      45.80          9200        9498

Note: Exon 1a: non-coding region of exon 1. Exon 1b: coding
region of exon 1.
Exon 15a: coding region of exon 15. Exon 15b: non-coding
region of exon 15.
wSSI-D1: wheat soluble starch synthase I gene.
rSSI: rice soluble starch synthase I gene.

These subclones were analysed by sequencing. The
intron/exon structure of the sg3 rice gene is shown in
Figure 18. The SSS I gene from T. tauschii is shown in SEQ
ID No: 13, while the deduced amino acid sequence is shown in
SEQ ID NO: 14.
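
Start and stop coordinates of the kind listed in Table 3 can be checked for internal consistency: each intron should begin one base after the preceding exon ends and finish one base before the next exon begins. The Python sketch below performs that check on the first few wSSI-D1 entries from Table 3. The function name and the assumption of 1-based inclusive coordinates are illustrative, not part of the specification.

```python
# (name, start site, stop site) for the first wSSI-D1 exons and introns,
# copied from Table 3. Coordinates assumed to be 1-based and inclusive.
EXONS = [("1b", 1, 316), ("2", 1473, 1828), ("3", 2746, 2823), ("4", 2906, 3028)]
INTRONS = [("1", 317, 1472), ("2", 1829, 2745), ("3", 2824, 2905)]


def contiguity_problems(exons, introns):
    """Describe every place where an intron does not exactly fill the gap
    between two consecutive exons. An empty list means no problems found."""
    problems = []
    for (up, down), intron in zip(zip(exons, exons[1:]), introns):
        name, start, stop = intron
        if start != up[2] + 1:
            problems.append(f"intron {name} does not start after exon {up[0]}")
        if stop != down[1] - 1:
            problems.append(f"intron {name} does not stop before exon {down[0]}")
    return problems


intron_lengths = [stop - start + 1 for _, start, stop in INTRONS]
print(contiguity_problems(EXONS, INTRONS), intron_lengths)
```

For the entries used here the check reports no gaps or overlaps, and the intron lengths derived from the coordinates agree with the wSSI-D1 lengths tabulated for introns 1 to 3.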






Example 20 Northern Hybridization Analysis of the
           Expression of Genes Encoding Soluble Starch
           Synthase

Total RNAs were purified from leaves, pre-anthesis
material, and various stages of developing endosperm at 5-8,
10-15 and 18-22 days post anthesis. Northern hybridization
analysis showed that mRNAs encoding wheat SSS I were
specifically expressed in developing endosperm.
Expression of these mRNAs in the leaves and pre-anthesis
materials could not be detected by northern hybridization
analysis under these experimental conditions. Wheat SSS I
mRNAs were expressed at high levels from an early stage of
endosperm development, 5-8 days post anthesis, and the
expression level in endosperm at 10-15 days post anthesis
was reduced.
These results are summarized in Figure 9a and Figure 9d.

Example 21 Genomic Localisation of Wheat Soluble Starch
           Synthase

DNA from chromosome engineered lines was digested
with the restriction enzyme BamHI and blotted onto supported
nitrocellulose membranes. A probe prepared from the 3' end
of the cDNA sequence, from positions 2345 to 2548, was used
to hybridise to this DNA. The presence of a specific band
was shown to be associated with the presence of
chromosome 7A (Figure 19). These data demonstrate location
of the SSS I gene on chromosome 7.



Example 22 Isolation of SSS I Promoter 

We have isolated the promoter that drives this 
pattern of expression for SSS I. The pattern of expression 
for SSS I is very similar to that for SBE II: the SSS I gene
transcript is detectable from an early stage of endosperm
development until the endosperm matures. The sequence of 
this promoter is given in SEQ ID No: 15. 

Example 23 Isolation of the Gene Encoding Debranching
           Enzyme from Wheat

The sugary-1 mutation in maize results in mature
dried kernels that have a glassy and translucent appearance;
immature kernels accumulate sucrose and other simple
sugars, as well as the water-soluble polysaccharide
phytoglycogen (Black et al, 1966). Most data indicate that
in sugary-1 mutants the concentration of amylose is
increased relative to that of amylopectin. Analysis of a
particular sugary-1 mutation (su-1Ref) by James et al
(1995) led to the isolation of a cDNA that shared
significant sequence identity with bacterial enzymes that
hydrolyse the α-1,6-glucosyl linkages of starch, such as an
isoamylase from Pseudomonas (Amemura et al, 1988), i.e.
bacterial debranching enzymes.

We have now isolated a sequence amplified from
wheat endosperm cDNA using the polymerase chain reaction
(PCR). This sequence is highly homologous to the sequence
for the sugary gene isolated by James et al (1995). This
sequence has been used to isolate homologous cDNA sequences
from a wheat endosperm library and genomic sequences from
Triticum tauschii.

Comparison of the deduced amino acid sequences of
DBE from maize with spinach (Accession SOPULSPO, GenBank
database), Pseudomonas (Amemura et al, 1988) and rice
(Nakamura et al, 1997) enabled us to deduce sequences which
could be useful in wheat. When these sequences were used as
PCR amplification primers with wheat genomic DNA a product
of 256 bp was produced. This was sequenced and was compared
to the sequence of maize sugary isolated by James et al
(1995). The results are shown in Figure 20a and Figure 20b.
This sequence has been termed wheat debranching enzyme
sequence I (WDBE-I).



Use of WDBE-I to investigate a cDNA library
constructed from wheat endosperm (Rahman et al, 1997)
enabled us to isolate two cDNA clones which hybridise
strongly to the WDBE-I probe. The nucleotide sequence of
the DNA insert in the longest of these clones is given in
SEQ ID No: 16.

Use of WDBE-I to investigate a genomic library
constructed from T. tauschii, as described above, has led to
the isolation of four genomic clones, designated I1, I2, I3
and I4 respectively, which hybridised strongly to the
WDBE-I sequence. These clones were shown to contain copies
of a single debranching enzyme gene. The sequence of one of
these clones, I2, is given in SEQ ID No: 17. The intron/exon
structure of the gene is shown in Figure 20c. Exons 1 to 4
were identified by comparison with the maize sugary-1 cDNA,
while Exons 5 to 18 were identified by comparison with the
cDNA sequence given in SEQ ID No: 16. The major features of
the DBE I gene are summarized in Table 4.

Hybridization of WDBE-I to DNA from T. tauschii
indicates one hybridizing fragment (Figure 21a). The
chromosomal location of the gene was shown to be on
chromosome 7 through hybridisation to nullisomic/tetrasomic
lines of the hexaploid wheat cultivar Chinese Spring
(Figure 21b).

We have clearly isolated a sequence from the wheat
genome that has high identity to the debranching enzyme cDNA
of maize characterised by James et al (1995). The isolation
of homologous cDNA sequences and genomic sequences enables
further characterisation of the debranching enzyme cDNA and
promoter sequences from wheat and T. tauschii. These
sequences and the WDBE-I sequences shown herein are useful
in the manipulation of wheat starch structure through
genetic manipulation and in the screening for mutants at the
equivalent sugary locus in wheat.

Figure 9e shows that the DBE I gene is expressed
during endosperm development in wheat and that the timing of
expression is similar to the SBE II and SSS I genes. Figure
9h shows that the full length mRNA for the gene (3.0 kb) is
found only in the wheat endosperm.

Example 24 Transient assays of Promoter-GFP Fusions

DNA constructs

DNA constructs for transient expression assays
were prepared by fusing sequences from the BEII and SSI
promoters to the gene encoding the Green Fluorescent
Protein. Green Fluorescent Protein (GFP) constructs
contained the GFP gene described by Sheen et al. (1995). The
nos 3' element (Bevan et al., 1983) was inserted 3' of the
GFP gene. The plasmid vector (pWGEM_NZfp) was constructed by
inserting the NotI to HindIII fragment from the following
sequence:

5' GCGGCCGCTC CCTGGCCGAC TTGGCCGAAG CTTGCATGCC TGCAGGTCGA
CTCTAGAGGA TCCCCGGGTA CCGAGCTCGA ATTCATCGAT GATATCAGAT
CCGGGCCCTC TAGATGCGGC CGCATGCATA AGCTT 3'

into the NotI and HindIII sites of pGem-13Zf(-) vector
(Promega). The sequences at the junction of the wSSSIpro1
and wSSSIpro2 and GFP were identical, and included the
junction sequence:

5' ....CGCGCGCCCA CACCCTGCAG GTCGACTCTA GAGGATCCAT GGTGAGCAAG
3'.

The sequence at the junction of wsbeIIpro1 and GFP was:

5' GCGACTGGCT GACTCAATCA CTACGCGGGG ATCCATGGTG AGCAAGGGCG
3'.

The sequence at the junction of wsbeIIpro2 and GFP was:

5' GGACTCCTCT CGCGCCGTCC TGAGCCGCGG ATCCATGGTG AGCAAGGGCG
3'.
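
The cloning sequence and the promoter-GFP junctions given above can be inspected for restriction sites with a simple string search. The Python sketch below is an illustration only: the sequences are transcribed from the text as printed (with spaces removed, so any transcription error in the source carries over), the recognition sequences are the standard ones for NotI, HindIII and BamHI, positions are 0-based, and the function names are hypothetical.

```python
POLYLINKER = (
    "GCGGCCGCTC" "CCTGGCCGAC" "TTGGCCGAAG" "CTTGCATGCC" "TGCAGGTCGA"
    "CTCTAGAGGA" "TCCCCGGGTA" "CCGAGCTCGA" "ATTCATCGAT" "GATATCAGAT"
    "CCGGGCCCTC" "TAGATGCGGC" "CGCATGCATA" "AGCTT"
)
SITES = {"NotI": "GCGGCCGC", "HindIII": "AAGCTT", "BamHI": "GGATCC"}
JUNCTIONS = {
    "wSSSIpro1/wSSSIpro2": "CGCGCGCCCACACCCTGCAGGTCGACTCTAGAGGATCCATGGTGAGCAAG",
    "wsbeIIpro1": "GCGACTGGCTGACTCAATCACTACGCGGGGATCCATGGTGAGCAAGGGCG",
    "wsbeIIpro2": "GGACTCCTCTCGCGCCGTCCTGAGCCGCGGATCCATGGTGAGCAAGGGCG",
}


def find_sites(seq, site):
    """0-based start of every occurrence of `site`, overlaps included."""
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start)
        start = seq.find(site, start + 1)
    return positions


def atg_after_bamhi(junction):
    """0-based index of an ATG immediately following a BamHI site, or -1."""
    idx = junction.find(SITES["BamHI"] + "ATG")
    return -1 if idx == -1 else idx + len(SITES["BamHI"])


for enzyme, site in SITES.items():
    print(enzyme, find_sites(POLYLINKER, site))
for name, junction in JUNCTIONS.items():
    print(name, atg_after_bamhi(junction))
```

In the sequences as transcribed, each junction carries a BamHI site directly followed by ATGGTGAGCAAG, which is consistent with the promoters having been joined to the start of the GFP coding region at a BamHI site.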

The structures of the constructs are shown in Figures 22a to
22f.

Table 4

Structural features of wDBEI-D1

Exon      Start       End         Comments
number    position    position

 1         1890        2241       (deduced by comparison with maize)
 2         2342        2524       (deduced by comparison with maize)
 3         2615        2707       (deduced by comparison with maize)
 4         3016        3168       (deduced by comparison with maize)
 5         3360        3436
 6         4313        4454
 7         4526        4633
 8         4734        4819
 9         5058        5129
10         5202        5328
11         5558        5644
12         6575        6671
13         7507        7661
14         8450        8527
15         8739        8823
16         8902        8981
17         9114        9231
18         Still being sequenced

Note that following nucleotides 3330, 6330 and 8419 there
may be short regions of DNA not yet sequenced.

B.

Feature                                Position

CAAAAT motif                           1833
TCAAT motif                            1838
ATAAATAA motif                         1804
Endosperm box like motif TAAAACG       1463

Preparation of target tissue

All explants used for transient assay were from
the hexaploid wheat cultivar, Milliwang. Endosperm (10-12
days after anthesis), embryos (12-14 days after anthesis)
and leaves (the second leaf from the top of plants
containing 5 leaves) were used. Developing seed or leaves
were collected, surface sterilized with 1.25% w/v sodium
hypochlorite for 20 minutes and rinsed with sterile
distilled water 8 times. Endosperms or embryos were
carefully excised from seed in order to avoid contamination
with surrounding tissues. Leaves were cut into 0.5 cm x 1
cm pieces. All tissues were aseptically transferred onto
SD1SM medium, which is an MS based medium containing 1 mg/L
2,4-D, 150 mg/L L-asparagine, 0.5 mg/L thiamine, 10 g/L
sucrose, 36 g/L sorbitol and 36 g/L mannitol. Each agar
plate contained either 12 endosperms, 12 embryos or 2 leaf
segments.

Preparation of gold particles and bombardment

Five µg of each plasmid was used for the
preparation of gold particles, as described by Witrzens et
al. (1998). Gold particle-DNA suspension in ethanol (10 µl)
was used for each bombardment using a Bio-Rad helium-driven
particle delivery system, PDS-1000.

GFP assay

The expression of GFP was observed after 36 to 72
hours incubation using a fluorescence microscope. Two plates
were bombarded for each construct. The numbers of expressing
regions were recorded for each target tissue, and are
summarized in Table 5. The intensity of the expression of
GFP from each of the promoters was estimated by visual
comparison of the light intensity emitted, and is summarized
in Table 6.

The DNA construct containing GFP without a
promoter region (pZLGFPNot) gave no evidence of transient
expression in embryo (panel l) or leaf (panel r) and
extremely weak and sporadic expression in endosperm (panel
f); this construct gave only very weak expression in
endosperm with respect to the number (Table 5) and
intensity (Table 6) of transient expression regions. The
constructs pwsssIpro1gfpNOT (panels b, h and n),
psbeIIpro1gfpNOT (panels d, j and p), and psbeIIpro2gfpNOT
(panels e, k and q) yielded low numbers (Table 5) of
strongly (Table 6) expressing regions in leaves, and there
was a very uneven distribution of expressing regions between
target leaf pieces (Table 5). pwsssIpro2gfpNOT (panels c, i
and o) gave no evidence of transient expression in leaves
(Table 5). These results show that each of the promoter
constructs is able to drive the transient expression of GFP
in the grain tissues, endosperm and embryo. The ability of
the short SSI promoter (pwsssIpro1gfpNOT, containing 1042 bp
5' of the ATG translation start site) to drive expression in
leaves (panel n) contrasts with the inability of the long
SSI promoter (pwsssIpro2gfpNOT, containing a 3914 base pair
region 5' of the ATG translation start site, panel o),
suggesting that regions for controlling tissue specificity
are located between -3914 and -1042 of the SSI promoter
region (SEQ ID No: 15).

Table 6

Comparison of the Intensities of Transient Expression

Tissue      pact_js-   pwsssIpro1  pwsssIpro2  psbeIIpro1  psbeIIpro2  pZLGFPNot
            gfp_nos    gfpNOT      gfpNOT      gfpNOT      gfpNOT

Endosperm   10         4           2.5         3.5         1.5         0.5
Embryo      10         5.5         5.5         1.5         1           0
Leaf        10         20          0           10          10          0

All intensities are relative to pact_js-gfp_nos transient
expression in the target tissue.

Relative intensities were independently scored by three
researchers and averaged.

WO 99/14314 PCT/AU98/00743

Example 25 Stable transformation of rice

Stable transformation of rice using Agrobacterium
was carried out essentially as described by Wang et al.
(1997). The plasmids containing the target DNA constructs
containing the promoter-reporter gene fusions are shown in
Figure 23. These plasmids were transformed into
Agrobacterium tumefaciens AGL1 by electroporation and
cultured on selection plates of LB media containing
[text illegible in source]
3 days, and then gently suspended in 10 ml NB liquid medium
containing 100 µM acetosyringone and mixed well. Embryogenic
rice calli (2 to 3 months old) derived from mature seeds
were immersed in the A. tumefaciens AGL1
suspension. After 3-10 minutes the A. tumefaciens AGL1
suspension medium was removed, and the rice calli were
transferred to NB medium containing 100 µM acetosyringone
for 48 h. The co-cultivated calli were washed with sterile
Milli Q H2O containing 150 mg/L timentin 7 times to remove
all Agrobacterium, plated on to NB medium containing 150
mg/L timentin and 30 mg/L hygromycin, and cultured for 3 to
4 weeks. Newly-formed buds on the surface of rice calli were
excised and plated onto NB Second Selection medium
containing 150 mg/L timentin and 50 mg/L hygromycin. After 4
weeks of proliferation calli were plated onto NB
Pre-Regeneration medium containing 150 mg/L timentin and 50
mg/L hygromycin, and cultured for 2 weeks. The calli were
then transferred on to NB-Regeneration medium containing 150
mg/L timentin and 50 mg/L hygromycin for 3 to 4 weeks. Once
shooting occurs, shoots are transferred onto rooting medium
(½ MS) containing 50 mg/L hygromycin. Once adequate root
formation occurs, the seedlings are transferred to soil,
grown in a misting chamber for 1-2 weeks, and grown to
maturity in a containment glasshouse.

Example 26 Use of probes from SSS I, SBE I, SBE II and
           DBE sequences to identify null or altered
           alleles for use in breeding programmes

DNA primer sets were designed to enable
amplification of the first 9 introns of the SBE II gene
using PCR. The design of the primer sets is illustrated in
Figure 24. Primers were based on the wSBE II-D1 sequence
(deduced from Figure 13b and Nair et al, 1997; Accession No.
Y11282) and were designed such that intron sequences in the
wSBE II sequence were amplified by PCR. These primer sets
individually amplify the first 9 introns of SBE II. One
primer (sr913F) contained a fluorescent label at the 5' end.
Following amplification, the products were digested with the
restriction enzyme DdeI and analysed using an ABI 377 DNA
Sequencer with Genescan™ fragment analysis software. One
primer set, for intron 5, was found to amplify products from
each of chromosomes 2A, 2B and 2D of wheat. This is shown in
Figure 25, which illustrates results obtained with various
wheat lines, and demonstrates that products from each of the
wheat genomes from diverse wheats were amplified, and that
therefore lines lacking the wSBEII gene on a specific
chromosome could be readily identified. Lane (iii)
illustrates the identification of the absence of the A
genome wSBEII gene from the hexaploid wheat cultivar Chinese
Spring ditelosomic line 2AS.

Figure 26 compares results of amplification with 
an intron 10 primer set for various nullisomic/tetrasomic 
lines of the hexaploid wheat Chinese Spring. Fluorescently 
labelled dUTP was included in the amplification 
reaction. Following amplification, the products were 
digested with the restriction enzyme DdeI and analysed using 
an ABI 377 DNA Sequencer with Genescan™ fragment analysis 
software. In lane (i), Chinese Spring ditelosomic line 2AS, a 
300 base product is absent; in lane (ii), N2BT2A, a 204 base 
product is absent; and in lane (iii), N2DT2B, a 191 base 
product is absent. These results demonstrate that the 
absence of specific wSBE II genes on each of the wheat 
chromosomes can be detected by this assay. Lines lacking 
wSBE II forms can be used as parental lines in breeding 
programmes for generation of new lines in which expression 
of SBE II is diminished or abolished, with a consequent 
increase in amylose content of the wheat grain. Thus a high 
amylose wheat can be produced. 
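
The scoring logic implied by Figure 26 can be sketched as follows. The product sizes (300, 204 and 191 bases) and their assignment to the A, B and D genomes follow from the stocks named for lanes (i) to (iii); the function name and the sizing tolerance of 1 base are assumptions made for illustration only.

```python
# Diagnostic product sizes (bases) reported for the assay shown in Figure 26.
# Genome assignment follows the stocks in which each product is absent:
# ditelosomic 2AS (A genome), N2BT2A (B genome), N2DT2B (D genome).
DIAGNOSTIC_PRODUCTS = {"A": 300, "B": 204, "D": 191}


def missing_genomes(observed_sizes, tolerance=1.0):
    """Return the genomes whose diagnostic product is not among the observed peaks.

    observed_sizes: iterable of fragment sizes (bases) called for one lane.
    tolerance: assumed sizing error allowed when matching a peak (hypothetical).
    """
    missing = []
    for genome, expected in DIAGNOSTIC_PRODUCTS.items():
        if not any(abs(size - expected) <= tolerance for size in observed_sizes):
            missing.append(genome)
    return missing


# A lane showing only the 204 and 191 base products lacks the A genome form.
print(missing_genomes([204.3, 190.8]))  # ['A']
```

A line returning an empty list carries all three forms, whereas a line returning one genome is a candidate parent carrying a null for that genome.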

Table 7 shows examples of primer pairs for SBE I, 
SSS I and DBE I which can identify genes from individual 
wheat genomes and could therefore be used to identify lines 
containing null or altered alleles. Such tests could be used 
to enable the development of wheat lines carrying null 
mutations in each of the genomes for a specific gene (for 
example SBE I, SSS I or DBE I), or combinations of null 
alleles for different genes. 

It will be apparent to the person skilled in the 
art that while the invention has been described in some 
detail for the purposes of clarity and understanding, 
various modifications and alterations to the embodiments and 
methods described herein may be made without departing from 
the scope of the inventive concept disclosed in this 
specification. 

References cited herein are listed on the following 
pages, and are incorporated herein by this reference. 
SEQUENCE LISTING 



(1) GENERAL INFORMATION: 

(i) APPLICANT: 

(A) NAME: COMMONWEALTH SCIENTIFIC AND INDUSTRIAL 
RESEARCH ORGANISATION 

(B) STREET: Limestone Avenue 

(C) CITY: Campbell 

(D) STATE: ACT 

(E) COUNTRY: AUSTRALIA 

(F) POSTAL CODE (ZIP): 2612 

(A) NAME: THE AUSTRALIAN NATIONAL UNIVERSITY 

(B) STREET: BRIAN LEWIS CRESCENT 

(C) CITY: ACTON 

(D) STATE: ACT 

(E) COUNTRY: AUSTRALIA 

(F) POSTAL CODE (ZIP): 2601 

(A) NAME: GOODMAN FIELDER LIMITED 

(B) STREET: LEVEL 42, GROSVENOR PLACE 

(C) CITY: SYDNEY 

(D) STATE: NSW 

(E) COUNTRY: AUSTRALIA 

(F) POSTAL CODE (ZIP): 2000 

(A) NAME: GROUPE LIMAGRAIN PACIFIC PTY LIMITED 

(B) STREET: LEVEL 31, 1 O'CONNELL STREET 

(C) CITY: SYDNEY 

(D) STATE: NSW 

(E) COUNTRY: AUSTRALIA 

(F) POSTAL CODE (ZIP): 2000 

(ii) TITLE OF INVENTION: REGULATION OF GENE EXPRESSION IN PLANTS 



(iii) NUMBER OF SEQUENCES: 17 

(iv) COMPUTER READABLE FORM: 
(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS 

(D) SOFTWARE: PatentIn Release #1.0, Version #1.30 (EPO) 

(2) INFORMATION FOR SEQ ID NO: 1: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 17 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(A) DESCRIPTION: /desc = "PCR primer based on the N-terminal sequence of wSBE I, 5' end at 
position 168 of SEQ ID NO:5" 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: triticum tauschii 
(F) TISSUE TYPE: Endosperm 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1: 

GGCACGCGAG AGACTGG 17 

(2) INFORMATION FOR SEQ ID NO: 2: 
(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 19 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(A) DESCRIPTION: /desc = "PCR primer in which 5' end is at position 1590 of SEQ ID NO:5" 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: triticum tauschii 

(F) TISSUE TYPE: Endosperm 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 

TACATTTCCT TGTCCATCA 19 

(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 18 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(A) DESCRIPTION: /desc = "PCR primer 5' end is at position 1 of SEQ ID NO:5" 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: triticum tauschii 

(F) TISSUE TYPE: Endosperm 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 

ATCACGAGAG CTTGCTCA 18 

(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(A) DESCRIPTION: /desc = "PCR primer 5' end is at position 334 of SEQ ID NO:5" 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: triticum tauschii 
(F) TISSUE TYPE: Endosperm 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 

CGGTACACAG TTGCGTCATT TTC 23 

(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2687 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: triticum tauschii 
(F) TISSUE TYPE: Endosperm 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 

ATCGACGAAG ATGCTCTGCC TCACCGCCCC CTCCTGCTCG CCATCTCTCC CGCCGCGCCC 60 
CTCCCGTCCC GCTGCTGACC GGCCCGGACC GGGGATTTCG GCCAAGAGCA AGTTCTCTGT 120 
TCCCGTGTCT GCGCCAAGAG ACTACACCAT GGCAACAGCT GAAGATGGTG TTGGCGACCT 180 
TCCGATATAC GATCTGGATC CGAAGTTTGC CGGCTTCAAG GAACACTTCA GTTATAGGAT 240 
GAAAAAGTAC CTTGACCAGA AACATTCGAT TGAGAAGCAC GAGGGAGGCC TTGAAGAGTT 300 
CTCTAAAGGC TATTTGAAGT TTGGGATCAA CACAGAAAAT GACGCAACTG TGTACCGGGA 360 
ATGGGCCCCT GCAGCAATGG ATGCACAACT TATTGGTGAC TTCAACAACT GGAATGGCTC 420 
TGGGCACAGG ATGACAAAGG ATAATTATGG TGTTTGGTCA ATCAGGATTT CCCATGTCAA 480 
TGGGAAACCT GCCATCCCCC ATAATTCCAA GGTTAAATTT CGATTTCACC GTGGAGATGG 540 
ACTATGGGTC GATCGGGTTC CTGCATGGAT TCGTTATGCA ACTTTTGACG CCTCTAAATT 600 
TGGAGCTCCA TATGACGGTG TTCACTGGGA TCCACCTTCT GGTGAAAGGT ATGTGTTTAA 660 
GCATCCTCGG CCTCGAAAGC CTGACGCTCC ACGTATTTAC GAGGCTCATG TGGGGATGAG 720 
TGGTGAGAGG CCTGAAGTAA GCACATACAG AGAATTTGCA GACAATGTGT TACCGCGCAT 780 
AAAGGCAAAC AACTACAACA CAGTTCAGCT GATGGCAATC ATGGAACATT CCATATTATG 840 
CTTCTTTTGG TACCATGTGA CGAATTTCTT CGCAGTTAGC AGCAGATCAG GAACACCAGA 900 
GGACCTCAAA TATCTTGTTG ACAAGGCACA TAGCTTAGGG TTGCGTGTTC TGATGGATGT 960 
TGTCCATAGC CATGCGAGCA GTAATATGAC AGATGGTCTA AATGGCTATG ATGTTGGACA 1020 
AAACACACAG GAGTCCTATT TCCATACAGG AGAAAGGGGT TATCATAAAC TGTGGGATAG 1080 
TCGCCTGTTC AACTATGCCA ATTGGGAGGT CTTACGGTAT CTTCTTTCTA ATCTGAGATA 1140 
TTGGATGGAC GAATTCATGT TTGACGGCTT CCGATTTGAT GGAGTAACAT CCATGCTATA 1200 
TAATCACCAT GGTATCAATA TGTCATTCGC TGGAAATTAC AAGGAATATT TTGGTTTGGA 1260 
TACCGATGTA GATGCAGTTG TTTACATGAT GCTTGCGAAC CATTTAATGC ACAAAATCTT 1320 
GCCAGAAGCA ACTGTTGTTG CAGAAGATGT TTCAGGCATG CCAGTGCTTT GTCGGTCAGT 1380 
TGATGAAGGT GGAGTAGGGT TTGACTATCG CCTTGCTATG GCTATTCCTG ATAGATGGAT 1440 
TGACTACTTG AAGAACAAAG ATGACCTTGA ATGGTCAATG AGTGCAATAG CACATACTCT 1500 
GACCAACAGG AGATATACGG AAAAGTGCAT TGCATATGCT GAGAGCCACG ATCAGTCTAT 1560 
TGTTGGCGAC AAGACTATGG CATTTCTCTT GATGGACAAG GAAATGTATA CTGGCATGTC 1620 
AGACTTGCAG CCTGCTTCAC CTACAATTGA TCGTGGAATT GCACTTCAAA AGATGATTCA 1680 
CTTCATCACC ATGGCCCTTG GAGGTGATGG CTACTTGAAT TTTATGGGTA ATGAGTTTGG 1740 
CCACCCAGAA TGGATTGACT TTCCAAGAGA AGGCAACAAC TGGAGTTATG ATAAATGCAG 1800 
ACGCCAGTGG AGCCTCTCAG ACATTGATCA CCTACGATAC AAGTACATGA ACGCATTTGA 1860 
TCAAGCAATG AATGCGCTCG ACGACAAGTT TTCCTTCCTA TCGTCATCAA AGCAGATTGT 1920 
CAGCGACATG AATGAGGAAA AGAAGATTAT TGTATTTGAA CGTGGAGATC TGGTCTTCGT 1980 
CTTCAATTTT CATCCCAGTA AAACTTATGA TGGTTACAAA GTCGGATGTG ATTTGCCTGG 2040 
GAAGTACAAG GTAGCTCTGG ACTCTGATGC TCTGATGTTT GGTGGACATG GAAGAGTGGC 2100 
CCAGTACAAC GATCACTTCA CGTCACCTGA AGGAGTACCA GGAGTACCTG AAACAAACTT 2160 
CAACAACCGC CCTAATTCAT TCAAAGTCCT GTCTCCACCC CGCACTTGTG TGGCTTACTA 2220 
TCGCGTCGAG GAAAAAGCGG AAAAGCCTAA GGATGAAGGA GCTGCTTCTT GGGGCAAAGC 2280 
TGCTCCTGGG TACATCGATG TTGAAGCCAC TCGTGTCAAA GACGCAGCAG ATGGTGAGGC 2340 
GACTTCTGGT TCCAAAAAGG CGTCTACAGG AGGTGACTCC AGCAAGAAGG GAATTAACTT 2400 
TGTCTTCGGG TCACCTGACA AAGATAACAA ATAAGCACCA TATCAACGCT TGATCAGAAC 2460 
CGTGTACCGA CGTCCTTGTA ATATTCCTGC TATTGCTAGT AGTAGCAATA CTGTCAAACT 2520 
GTGCAGACTT GAGATTCTGG CTTGGACTTT GCTGAGGTTA CCTACTATAT AGAAAGATAA 2580 
ATAAGAGGTG ATGGTGCGGG TCGAGTCCGG CTATATGTGC CAAATATGCG CCATCCCGAG 2640 
TCCTCTGTCA TAAAGGAAGT TTCGGGCTTT CAGCCCAGAA TAAAAAA 2687 

(2) INFORMATION FOR SEQ ID NO: 6: 
(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 807 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: 

(vi) ORIGINAL SOURCE: 
(A) ORGANISM: triticum tauschii 
(F) TISSUE TYPE: Endosperm 

(ix) FEATURE: 

(A) NAME/KEY: Protein 

(B) LOCATION: 1..807 

(D) OTHER INFORMATION:/label= sbel 
/note= "deduced amino acid sequence from SEQ ID NO:5" 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 

Met Leu Cys Leu Thr Ala Pro Ser Cys Ser Pro Ser Leu Pro Pro Arg 
1 5 10 15 

Pro Ser Arg Pro Ala Ala Asp Arg Pro Gly Pro Gly Ile Ser Ala Lys 
20 25 30 

Ser Lys Phe Ser Val Pro Val Ser Ala Pro Arg Asp Tyr Thr Met Ala 
35 40 45 

Thr Ala Glu Asp Gly Val Gly Asp Leu Pro Ile Tyr Asp Leu Asp Pro 
50 55 60 

Lys Phe Ala Gly Phe Lys Glu His Phe Ser Tyr Arg Met Lys Lys Tyr 
65 70 75 80 

Leu Asp Gln Lys His Ser Ile Glu Lys His Glu Gly Gly Leu Glu Glu 
85 90 95 

Phe Ser Lys Gly Tyr Leu Lys Phe Gly Ile Asn Thr Glu Asn Asp Ala 
100 105 110 

Thr Val Tyr Arg Glu Trp Ala Pro Ala Ala Met Asp Ala Gln Leu Ile 
115 120 125 

Gly Asp Phe Asn Asn Trp Asn Gly Ser Gly His Arg Met Thr Lys Asp 
130 135 140 

Asn Tyr Gly Val Trp Ser Ile Arg Ile Ser His Val Asn Gly Lys Pro 
145 150 155 160 

Ala Ile Pro His Asn Ser Lys Val Lys Phe Arg Phe His Arg Gly Asp 
165 170 175 

Gly Leu Trp Val Asp Arg Val Pro Ala Trp Ile Arg Tyr Ala Thr Phe 
180 185 190 

Asp Ala Ser Lys Phe Gly Ala Pro Tyr Asp Gly Val His Trp Asp Pro 
195 200 205 

Pro Ser Gly Glu Arg Tyr Val Phe Lys His Pro Arg Pro Arg Lys Pro 
210 215 220 

Asp Ala Pro Arg Ile Tyr Glu Ala His Val Gly Met Ser Gly Glu Arg 
225 230 235 240 

Pro Glu Val Ser Thr Tyr Arg Glu Phe Ala Asp Asn Val Leu Pro Arg 
245 250 255 

Ile Lys Ala Asn Asn Tyr Asn Thr Val Gln Leu Met Ala Ile Met Glu 
260 265 270 

His Ser Ile Leu Cys Phe Phe Trp Tyr His Val Thr Asn Phe Phe Ala 
275 280 285 

Val Ser Ser Arg Ser Gly Thr Pro Glu Asp Leu Lys Tyr Leu Val Asp 
290 295 300 

Lys Ala His Ser Leu Gly Leu Arg Val Leu Met Asp Val Val His Ser 
305 310 315 320 

His Ala Ser Ser Asn Met Thr Asp Gly Leu Asn Gly Tyr Asp Val Gly 
325 330 335 

Gln Asn Thr Gln Glu Ser Tyr Phe His Thr Gly Glu Arg Gly Tyr His 
340 345 350 

Lys Leu Trp Asp Ser Arg Leu Phe Asn Tyr Ala Asn Trp Glu Val Leu 
355 360 365 

Arg Tyr Leu Leu Ser Asn Leu Arg Tyr Trp Met Asp Glu Phe Met Phe 
370 375 380 

Asp Gly Phe Arg Phe Asp Gly Val Thr Ser Met Leu Tyr Asn His His 
385 390 395 400 

Gly Ile Asn Met Ser Phe Ala Gly Asn Tyr Lys Glu Tyr Phe Gly Leu 
405 410 415 

Asp Thr Asp Val Asp Ala Val Val Tyr Met Met Leu Ala Asn His Leu 
420 425 430 

Met His Lys Ile Leu Pro Glu Ala Thr Val Val Ala Glu Asp Val Ser 
435 440 445 

Gly Met Pro Val Leu Cys Arg Ser Val Asp Glu Gly Gly Val Gly Phe 
450 455 460 

Asp Tyr Arg Leu Ala Met Ala Ile Pro Asp Arg Trp Ile Asp Tyr Leu 
465 470 475 480 

Lys Asn Lys Asp Asp Leu Glu Trp Ser Met Ser Ala Ile Ala His Thr 
485 490 495 

Leu Thr Asn Arg Arg Tyr Thr Glu Lys Cys Ile Ala Tyr Ala Glu Ser 
500 505 510 

His Asp Gln Ser Ile Val Gly Asp Lys Thr Met Ala Phe Leu Leu Met 
515 520 525 

Asp Lys Glu Met Tyr Thr Gly Met Ser Asp Leu Gln Pro Ala Ser Pro 
530 535 540 

Thr Ile Asp Arg Gly Ile Ala Leu Gln Lys Met Ile His Phe Ile Thr 
545 550 555 560 

Met Ala Leu Gly Gly Asp Gly Tyr Leu Asn Phe Met Gly Asn Glu Phe 
565 570 575 

Gly His Pro Glu Trp Ile Asp Phe Pro Arg Glu Gly Asn Asn Trp Ser 
580 585 590 

Tyr Asp Lys Cys Arg Arg Gln Trp Ser Leu Ser Asp Ile Asp His Leu 
595 600 605 

Arg Tyr Lys Tyr Met Asn Ala Phe Asp Gln Ala Met Asn Ala Leu Asp 
610 615 620 

Asp Lys Phe Ser Phe Leu Ser Ser Ser Lys Gln Ile Val Ser Asp Met 
625 630 635 640 

Asn Glu Glu Lys Lys Ile Ile Val Phe Glu Arg Gly Asp Leu Val Phe 
645 650 655 

Val Phe Asn Phe His Pro Ser Lys Thr Tyr Asp Gly Tyr Lys Val Gly 
660 665 670 

Cys Asp Leu Pro Gly Lys Tyr Lys Val Ala Leu Asp Ser Asp Ala Leu 
675 680 685 

Met Phe Gly Gly His Gly Arg Val Ala Gln Tyr Asn Asp His Phe Thr 
690 695 700 

Ser Pro Glu Gly Val Pro Gly Val Pro Glu Thr Asn Phe Asn Asn Arg 
705 710 715 720 

Pro Asn Ser Phe Lys Val Leu Ser Pro Pro Arg Thr Cys Val Ala Tyr 
725 730 735 

Tyr Arg Val Glu Glu Lys Ala Glu Lys Pro Lys Asp Glu Gly Ala Ala 
740 745 750 

Ser Trp Gly Lys Ala Ala Pro Gly Tyr Ile Asp Val Glu Ala Thr Arg 
755 760 765 

Val Lys Asp Ala Ala Asp Gly Glu Ala Thr Ser Gly Ser Lys Lys Ala 
770 775 780 

Ser Thr Gly Gly Asp Ser Ser Lys Lys Gly Ile Asn Phe Val Phe Gly 
785 790 795 800 

Ser Pro Asp Lys Asp Asn Lys 
805 



(2) INFORMATION FOR SEQ ID NO: 7: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 319 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: 

(vi) ORIGINAL SOURCE: 
(A) ORGANISM: triticum tauschii 
(F) TISSUE TYPE: Endosperm 

(ix) FEATURE: 

(A) NAME/KEY: misc_signal 

(B) LOCATION: 1..319 

(D) OTHER INFORMATION:/function= "3' untranslated 
of wSBE I-D4 cDNA" 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 

GCGACTTCTG GTTCCAAAAA GGCGTCTACA GGAGGTGACT CCAGCAAGAA GGGAATTAAC 60 
TTTGTCTTCG GGTCACCTGA CAAAGATAAC AAATAAGCAC CATATCAACG CTTGATCAGA 120 
ACCGTGTACC GACGTCCTTG TAATATTCCT GCTATTGCTA GTAGTAGCAA TACTGTCAAA 180 
CTGTGCAGAC TTGAGATTCT GGCTTGGACT TTGCTGAGGT TACCTACTAT ATAGAAAGAT 240 
AAATAAGAGG TGATGGTGCG GGTCGAGTCC GGCTATATGT GCCAAATATG CGCCATCCCG 300 
AGTCCTCTGT CATAAAGGA 319 

40 

(2) INFORMATION FOR SEQ ID NO: 8: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4890 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: 

(vi) ORIGINAL SOURCE: 
(A) ORGANISM: triticum tauschii 
(F) TISSUE TYPE: Endosperm 

(ix) FEATURE: 
(A) NAME/KEY: promoter 

(B) LOCATION: 1..4890 

(D) OTHER INFORMATION:/function= "promoter containing 
sequence of SBE I" 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 

GGGTGGCGGG TCGGGCGGCA AGGCGCGGGG CGGCGGGGCG GCCGGGGCGG CGCGGCGGCG 60 
CGGGCGGCAG CGGCGGCTAG GGTTTCGCGG CGGCGGCGAC TTGGGCTGAG GCGGGGCACG 120 
GGCTGCGGCT TTAAAGGCCG GCCAGGCTGA GGTGTCCGGG TCGGACACGG CCCGTAAGGC 180 
GGTTGACTTT AAAAAATAAT AATTCGGACA TGCAAAAAAG TAAGAAAAGA AATAATAAAC 240 
GGACTCCAAA AATCCCGAAG TAAATTTTTC CCCATTCTTA AAAATAAGCC GGACAAGATG 300 
AACATTTATT TGGGCCTAAA ATGCAATTTT GAAAAATGCG TATTTTTCCT AATTCGGAAT 360 
AAAATCAAAT AAAATCCAAA TAAAATCAAA TATTTGTTTT TAATATTTTT CCTCCAATAT 420 
TTCATTATTT GTGAAGAAGT CATTTTATCC CATCTCATAT ATTTTGATAT GAAATATTTT 480 
CGGAGAGAAA AATAATTAAA ACAAATGATC CTATTTTCAA AATTTGAGAA AACCCAAATA 540 
TGAAAATAAC GAAATCCCCA ACTCTCTCCG TGGGTCCTTG AGTTGCGTGA AATTTCTAGG 600 
ATCACAAATC AAAATGCAAT AAAATATGAT ATGCATGATG ATCTAATGTA TAACATTCCA 660 
ATTGAAAATT TGGGATGTTA CATATAACTC AAATTCTATA ATTATGAACA CAGAAATATT 720 
AATGTAGAAC TCTATTTTGT TTTGAAATTG TATTATTTTT TAGAATTAGT CTAGAGCATT 780 
TCGTGAACTT GAATCAAACC TTTAAATAAA ACAAAGCATA AAAATGACAA ATTCACATAT 840 
GAAATAACTT GTGTTACATA GATTTATTAC AATAGCGTTG TATGTGTGTA TGTGTGCGTG 900 
AGTGCCTATG GTAATATCAA TAAATATCTT GATAGATGTT TCTACAATTC ACGGGTCTAA 960 
CTAGTAATGC AATGCAATGC ATGCTAAAAG AATAGAACCT TAGTTTCATT TAACTAACAA 1020 
TTTTCAAATG TATGAGTTGC CAACAAGTGG CATACTTGGC ACTGTTTGTT TGTTCATTTT 1080 
ATGGAAAGTT CTTCTCTTTT TACATGGTTT AGATTCCAGC ATGTAGCCAC AAAATATGAT 1140 
TGTCAAAAGA TAATACCTCA TAATACAATT CCACTAAAGT CACCTAGCCC AAGTGACCGA 1200 
CCTGATCCTG AAATAAAATC AGAAGATTTG GTGTCATCAT CATGACAACA AATTATTAGG 1260 
CGGTAGATCT TGTGGTAGTA CTCATGATGT AAAATTATCA AGAGGGAGAG AATGTATGGA 1320 
GATTTATGTG AAGTACATCG TACACCAGAC ATAGTTGACA CATCGATTTT TTAAGATACA 1380 
TTTGGACGCG CCTTGTGGGA GTGTAAAGTA CTACCATGTA TTAGAAGAGG TGAAATGAGA 1440 
AATGCCATAG CTAGCAAGTA GGCCTAGTTA AGGAAATTCT TCCTTAGATC CCCTTCTCCC 1500 
GAAGAGTGAA GTGCTTCAAC TAAAGGTTAG ACCCACTTAA AAAATGTCAC TTTGAATCTT 1560 
TGCTTCCCTT GTCGTAATCC TGTGCATTTG TAGGTCCCTC GGATCTGAGC CCTTTCTCCA 1620 
AGCCCTTCAT TGGATTCCCC TGGATGTCTT TTTGTTACAT TTTATTGAAG TGAGAGTGAA 1680 
TTATTATATG CCCATAGGAG GTGGGATATA AAGGCTGTTG GTATTCTGCA CCATACATGC 1740 

TAGAGTAGGG AGGAGAGGCT GGTGCATGAT ACATGGTGGA CTAGCCCATA TATTTACCCC 1800 
TCCCCCACCC ACTAACAAGT TTTTTTTATT AGGTCTTCAT CCTCTGATTT GTTTTTCTGT 1860 
TAGCCCATTC TTCATCATGG ACTTATTAAT CATGATTAGT TTCTTGGATT TTTGTTTACT 1920 
TGACTTGAAT TTGACAATGT GCCTCATATA TGGCATGTGG GACTGATAGG AAGATATATT 1980 
CTCACAACAT TAACTTAAAA AGGATTATTT TTTTGGTGCA GTCGTAAAGA AAACTACTTT 2040 
CTTTTATGCT AAAAGTTATT CAAACATAGA TTTATAAACA AAGGATATCA CCATGCATGA 2100 
CCATGCGCTC TCTCATGTTT ACTCTAGAAA CCATATATCT CTTTGTTGCA AAATATTTAA 2160 
TCTATCCTCC TTGTTTCTGG GAATGAGTCG GGGAAGGTAA TCTTAGGGAA GGTTAAAGTG 2220 
AGGCAAGTAA GAGCAACTCT AGCAGAGTCG CGATATGCCC AATCGCCATA ATGCCAATAT 2280 
GGCATTTTTG GCCCAAAATG GCACTTCAGA AGAGTCACCA TATCCCTTCG GATAGCCATA 2340 
ATTTAGGGAG CTCGCTCCAC AAACAAGCTT CGAGCCTCCA AATATGGAGG CCATGGATTC 2400 
GTTGTTTGGC ACTCACTCCA TATCCAACCG CAAGCGCATG CATGAGGGAA GTTTTAGCTT 2460 
CTTCCTCCTT GCGCCAACGC CGGGATTTTA CACAGCGCAT TACAGGTACA TGAACCAGCA 2520 
TGCACAGATA ATCACCGACG AGTGGGGTGA CAAGAAGGAT AAGCACCCTC CCATTAGTGG 2580 
TGCGCCCACT CCCCTCAAAT TCATGAGGCA GCCATTTGGA TGGTCATCGC GTGGCATAAG 2640 
CTCCGACTAT AAAATCTCAA CGGCATCACC AAAACCATAG CTGCCGCCTC CCCCTTCCTC 2700 
GGCATCACCT CCCCAAGACA TCTCCTCCCC TCTATGCCAC AATGTCATCA TTATGGAGAG 2760 
ACACAACTAC TGGTAAACCG CATACCCAAT CATGGTTTAC CGGCAGTGCG AACCCCACCT 2820 
TCCTCCCACG ATGGTAGGAT ATTCTCCTCC TAGAATGGCG CGTGTGGCGC TTCCTCCTCC 2880 
CGAGGCTGAT ATGTCGGCTC CCATGATGGC GTGCATCATT GATTTGGCGC TTCGGGTCCA 2940 
TCATACATGT TAACGAGGTC ATCCCCATTG ATGTCGTTGG TCCCCTTGCC CCCCAGTCGG 3000 
ATCCTGAGGA CCCGTTCGAT GTCGCAATGC GACTCTCCAA ACTCAAAGCT CACAATGAGG 3060 
AGTACGTCCT CTAGGAGTTC CGCCCCGCAA CCATCTATAA GGAGGAGCAA CGATAGCTCT 3120 
CCCCTACGCC TTCCTCGACG ATCTCTCTTA GGAGGACAAC GGCTAGACGA CGGCGGCGGC 3180 
GGCGAAGGTA CTGCAGGTAG TAGAACATAG CAATGTCGAA TGGCGACATT GCATATTTTG 3240 
AAAATGTCGC TCAACGACTT TTGAAGTCGC AAATAAAATG TAGTGTGACT ACTTTTGGCC 3300 
AGCAATATAA GTTTATCACA TTTGATAATG ATTTGAACCG GTGTGGTTCA ACTAAATGTA 3360 
CCATAAATTG AACATACAAA TTTTTAGCAA ATGAAAAAAG AAACAAGTAA GACCACAAAT 3420 
ATGAAAGCCG CATATCGCGA CTATGTGTTT GAGCCGCAGC TGCCAAGTAC ATATGAAGCG 3480 
TACTCCATAT GACATACGAC AACCATACAT ATGAAGACTC TACTAGAGTT CTCTAAGGCC 3540 
GCTTTTAGCG CCTTTCGTGC AGTGGTGCCC ATAGGGAGTG AGGGTAGTTG GACTGTTCGT 3600 
TTCCCCTTTT TTCATTTCTT TGAAATCTAT TTTATTTTTT TTCTCTTTTG TAGGTTTCCC 3660 
AAATTTATAT ACCATTTTTC TGTTTCTCGC TATTTTTTGT TGTTATATTC TAGTTTCATA 3720 
TTTTTCTATT ATTAATTTGT GTCTCTTATG AGAAGTCCAG ACTTGCATAT GGAGGTGCAC 3780 
ACACAAACAT ATAAAGTATA AATACTAACT TGAGAAGTAT GTTTGCGTGG TCAAAAAAAC 3840 
ATCATCAAAA CCTGCCAATA TGAGATATAG TTTTGAATAT ATCAATATGA GCAACGCAAC 3900 
CATTTAAAAT GTGAACAATT GTTTTTTTAG AAAAAATATA AGAAATAACT CCAACCCAGC 3960 
CAAACCACAT GCTATACACT TGCTCCATAT GAAACCATGT TTGCTATTGG GCAGTTGCCT 4020 
GAAACCGAAA GTAATGTTAG CCGTTTTTCT ATTCAAAGAA GAAGGAGAGT CGAGGTGACG 4080 
CGATGCTTAG ACGTGAGATG GGGATGACCA CAACGTCCCT ACAGAGACCT CACCGGAGAT 4140 
GGGGACATTG CAGTTGACAC GAGAGCGGTG AGGGGCTGCG ATGCGTGTGC GGCAACATGT 4200 
GGCGAGGCGG ACGTCGGGCT GGCAGGTAGG GGGGAGGGGG AAGGACCGGG GGAGGAAGAA 4260 
GAGGAGTAGC CTGCAAAACA TGGTACACCA GTTTTCTGCC CTACGAAAAC CTCATTTCAT 4320 
TCCCCCACCC TGACAAGCAA CAACCAACCA TCGCAGTCCC ACATGTCCCT CTGGTCTTTG 4380 
CAAAAAGTAA TTGTTCTTGC TGGACAGCGC AAAGAGTAAA CTTTTGTTAG TTTTCATTTC 4440 
TAGAAAAAGC AATCCTTTTA TAGTTCTTTT GTGAAAGTAA TGCTTTTATA GTGATTGGGA 4500 
TGTTCTTTTA GAGCAAATAT CTTCTTTTTT TTTTAGGGAA AAGAGCAAAT ATCTTCCACT 4560 
TTTCACAAAA CTGACGAAGG CTGAAAGTGG CGAGACAGTG AGGGCCCATA GCTTTCGTCC 4620 
GGCCCAGCGG CGCACGACCG TCCACGTGCA CCCCGGCCCT CCCGGGCCCG CAGATCCGTT 4680 
CTCCCTCGCC CCCGTTTCCC CCTCCCTCCC TCTCGTTGCT TCCACTCCAC TGTTCTCCTC 4740 
TTCCTGTCCA AAGCGGCCAC GGACCGGAAA AAAATCACGC CTTTCCGTTG GGTCTCCGGC 4800 
GCCACACTCC TCCTCCGGCC GATATAAAGC GCGCGGGGCC ACGGGCCCGG CGCAAAATGG 4860 
GATTCCCGTC CGCCGCCATG GAGGAAGATG 4890 

(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6228 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: 

(vi) ORIGINAL SOURCE: 
(A) ORGANISM: triticum tauschii 
(F) TISSUE TYPE: Endosperm 

(ix) FEATURE: 

(A) NAME/KEY: misc_feature 

(B) LOCATION:! 

(D) OTHER INFORMATION:/product= "coding region of wSBE I-D4 gene" 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 

ACGGGCCCGG CGCAAAATGG GATTCCCGTC CGCCGCCATC GACGAAGATG CTCTGCCTCA 60 
CCGCCCCCTC CTGCTCGCCA TCTCTCCCGC CGCGCCCCTC CCGTCCCGCT GCTGACCGGC 120 
CCGGACCGGG GATCTCGGTG AGTCAGTCGG GATCTTCATT TCTTTTCTTT TCTTTCGTTT 180 
CCGGCTCCGT TCTGCCGGGG TTTCCCTGAT GCGATGCCGC GCGCGCGCAG GGCGGCGGCA 240 
ATGTGCGGCT GAGCGCGGTG CCCGCGCCCT CTTCGCTCCG CTGGTCGTGG CCGCGGAAGG 300 
TGAGCCCTCT CCCCTGTCTA CCCAGATTTG CGACCGTGAT CCCCTGTTGT CGCCGGGCAA 360 
ACGGAATCTG ATCCACGGTG GTTATTGGAA ATAGTATATA CTACTAATAA ACTTGAGGCT 420 
GGGATTCGTC CACTGAGGAA CAAGTGGATG CGATTTCGAT TGGATTTCTC TGCTTTATGC 480 
GATCCGTACG CAGAATATCC CTCCTGCAGT GTCTCAACCG TATTACTGGA TGTACAACCC 540 
AAATGTGTAT AATCTGTGCT GAATGTATCA ACCAATAATT GCTGCATTGT GAAAACATAA 600 
TCCTGTGTTG TGTCTCTACT ACTTGTTCAG TCCTGATCTG CCGCTTATCC TAACTTTTGT 660 
TCATTTATGG AAGGCCAAGA GCAAGTTCTC TGTTCCCGTG TCTGCGCCAA GAGACTACAC 720 
CATGGCAACA GCTGAAGATG GTGTTGGCGA CCTTCCGATA TACGATCTGG ATCCGAAGTT 780 
TGCCGGCTTC AAGGAACACT TCAGTTATAG GATGAAAAAG TACCTTGACC AGAAACATTC 840 
GATTGAGAAG CACGAGGGAG GCCTTGAAGA GTTCTCTAAA GGTTAGCTTT TGTTTCATGT 900 
GTTTGAAACA ATAGTTACAT CTTGTGGCGT CCGCAGCACA AAAGACATAA TGCGACTCTG 960 
TTTTGTAGGC TATTTGAAGT TTGGGATCAA CACAGAAAAT GACGCAACTG TGTACCGGGA 1020 
ATGGGCCCCT GCAGCAATGT AAGTTCTAGT GTTGTCACGC AACTAATTGC AATGGTCGTT 1080 
GGTTAACTTA TGAAGTGCTG ATGAAACTGT CTTAAGAGTT TATGGCTTGT CTTTTCTGAT 1140 
TCTAGCTAGT AAAGAGTAGA TAAATATGAA ATATGTTTTC CCTTTTCTAG TTATGGTCAT 1200 
GGTTGGCTGG TATTCATTTC TTTTATGGCA ATACTTGCTT CTAACTATCT TTAGTAGATT 1260 
CATGTATTTA CTTGTGAGTC ATTACTTTAT GGGTGTAGGG ATGCACAACT TATTGGTGAC 1320 
TTCAACAACT GGAATGGCTC TGGGCACAGG ATGACAAAGG ATAATTATGG TGTTTGGTCA 1380 
ATCAGGATTT CCCATGTCAA TGGGAAACCT GCCATCCCCC ATAATTCCAA GGTTAAATTT 1440 
CGATTTCACC GTGGAGATGG ACTATGGGTC GATCGGGTTC CTGCATGGAT TCGTTATGCA 1500 
ACTTTTGATG CCTCTAAATT TGGAGCTCCA TATGACGGTG TTCACTGGGA TCCACCTTCT 1560 
GGTGAAAGGT CTACTTTTAG TGGCTCGAGA GCAAGAAATC TAAGTAAAAC CCACACAATT 1620 
AACTTACATT AATGTGGAGA CATGATACTT TTATTGCTCG TTTTGCAGGT ATGTGTTTAA 1680 

GCATCCTCGG CCTCGAAAGC CTGACGCTCC ACGTATTTAC GAGGCTCATG TGGGGATGAG 1740 
TGGTGAAAAG CCTGAAGTAA GCACATACAG AGAATTTGCA GACAATGTGT TACCGCGCAT 1800 
AAAGGCAAAC AACTACAACA CAGTTCAGCT GATGGCAATC ATGGAACATT CATATTATGC 1860 
TTCTTTTGGG TACCATGTGA CGAATTTCTT CGCAGTTAGC AGCAGATCAG AACGCCAGAG 1920 
ACCTCAATAT CTTGTTGACA AGGCACATAG TTTACGGTTG CGTGTTCTGA TGGATGTTGT 1980 
CCATAGCCAT GCGAGCAGTA ATAAGACAGA TGGTCTTAAT GGCTATGATG TTGGGCAAAA 2040 
CACACAGGAG TCCTATTTCC ACACAGGAGA AAGGGGCTAT CATAAACTGT GGGATAGCCG 2100 
CCTGTTCAAC TATGCCAATT GGGAGTCTTA CGATTTCTTC TTTCTAATCT GAGATATTGG 2160 
ATGGACGAAT TCATGTTTGA TGGCTTCCGA TTTGATGGGG TAACATCCAT GCTATATAAT 2220 
CACCATGGTA TCAATATGTC ATTCGCTGGA AGTTACAAGG AATATTTTGG TTTGGATACT 2280 
GATGTAGATG CAGTTGTTTA CCTGATGCTT GCGAACCATT TAATGCACAA ACTCTTGCCA 2340 
GAAGCAACTG TTGTTGCAGA AGATGTTTCA GGCATGCCAG TGCTTTGTCG GTCAGTTGAT 2400 
GAAGGTGGAG TAGGGTTTGA CTATCGCCTG GCTATGGCTA TTCCTGATAG ATGGATCGAC 2460 

TACTTGAAGA ACAAAGATGA CCTTGAATGG TCAATGAGTG GAATAGCACA TACTCTGACC 2520 
AACAGGAGAT ATACGGAAAA GTGCATTGCA TATGCTGAGA GCCATGATCA GGTATGTTTT 2580 
CCCTCCTTTG TCGCTGTGCG TGAGTATGTG TTCTTTTTTT ATGGGGCACT GGTCTAAGAA 2640 
CATACAGTTC AAAGGTGAGA CACTTTCTTT GCCTGGTAGA CAAATTTGAG AAATAAACAT 2700 
TTCGCTTGAT GACTTTTAGT TGCTTCACAA GTTCGAATTA AGTTAGTTAT ATTCTGATAA 2760 
CTAGTGATAG TACCCACTAA CCAGCTATTA CGGACCATGT AAGAATGTCC GAAGACTGCA 2820 
GTTATATATC GTTGACTTTG TGTTCATCTA TTGAAACAAC TTAGTAGTTA ACTTTCACGC 2880 
AAATTTTCAG TCTATTGTTG GCGACAAGAC TATGGCATTT CTCTTGATGG ACAAGGAAAT 2940 
GTATACTGGC ATGTCAGACT TGCAGCCTGC TTCGCCTACA ATTGATCGTG GAATTGCACT 3000 
TCAAAAGGTT CGATTCGTTT TAAGTATTCC TGAATTTGAT GTTCTAGTTC CAGACGAGTA 3060 
TTGTAATGTT CGTTGTTACT CAGAGTTCTG CTTAGTCCTT GAAGATAATG TATTCCAGTC 3120 
CCTTTTGGTA CATTTGGCTT ATTTTGTTAC AAATATTTCA GATGATTCAC TTCATCACCA 3180 
TGGCCCTTGG AGGTGATGGC TACTTGAATT TTATGGGTAA TGAGGTAATA TCTGGTTATC 3240 
TGTCAAAACT TATTTCTGAT CAATATGTTT CGGGATTCCC TCGAAAAAAA TCCTTTGGGC 3300 
AGGGCGAAAA GTTTAAACAT CTGTTTTCTA TGATAGCCAA GTACTCCCCA GCTATTTCCA 3360 
TGTTATCACG TATCATTTAG CTGTGCCGGT AGTTAATCTT TATTCTAATT CATTGTTGTT 3420 
TTTTAGCGTG GCAGTCTATT GTTGGATCCT CTTATTCCAA TTACATATAT GCCGACATCA 3480 
CACACTTATG AATATTCCCT GTTTAAAAGA TTTTTATTTT ATACCAATGT TTCTCCGTAA 3540 
ATGATGCAAA CATGATAGAG ATGTTAGCAT GTCTTTCTTA ACCTACTCAT GTTTTACATA 3600 
TCACGACAAG CTTCTTGCAG AAAATCAGCA GTATATGGCA AATTGCTGCA ACCTGACAAC 3660 
GTTTATATCT GTTTTCTAAC TCATACTGAC GGTGCAATTT CCTTTTAGTT TGGCCACCCA 3720 
GAATGGATTG ACTTTCCAGA AGAAGGCAAC AACTGGAGTT ATGATAAATG CAGACGCCAG 3780 
TGGAGCCTCG CAGACATTGA TCACCTACGA TACAAGGTTA TGCCTATGTA TATTTTTACA 3840 
GTTTCTGGTC TGGTAGCTCT CTTGGGATCT TGACCTCACT TAGTTCCTTC ATCTCTGACT 3900 
GTAGCTTATT TACACTGTGT TCCAACTTCT GTCTTGTGGA TAAATTCTCC CTTCTAACGT 3960 
TTCATATTAA GCCTTTCAAA CTAAACTAAA TTGCTGATCT ACTACTAGTT GCTCAGTACG 4020 
ATGACCAAAT CTTGCCTGTG GTAACCTAGT AATTTTCTTG ATTCTTACAC ATTAGTGATA 4080 
TGCAGTGCAT ACATTATCCA TATAAATTGA CATTGCAATT TCCCAAATAT TATTTGAAGG 4140 
CTGTGTTCTT TTGTTAACAG GAAGTTATTT TCTCTGCATC TGATAAATAA TAATAGCCTT 4200 
TCACGATTTT TCTCATATTT TATCCAACTT TTCTGCATTC AAGCATTTTT TGTTTCTCGC 4260 
CTAACATATA TAATTTGAAC AGTACATGAA CGCATTTGAT CAAGCAATGA ATGCGCTCGA 4320 
CGACAAATTT TCCTTCCTAT CATCATCAAA GCAGATTGTC AGCGACATGA ATGAGGAAAA 4380 
GAAGTAGTTA ACTATACAAT GTTTAGTCAG GGCAGCTGTT GCATCATTTG ATTCACTCCT 4440 
ACTCTTAAGA ATAGCAACTC TGACTTGTGC GTTTTATGTT ACCAAATAAG TTGAAACCGT 4500 
ATCTGTTTGA TATGAACCAT TGTTGTCTCA AAATGGGCTA TGGACTCAAT CCAACTTCCT 4560 
TTCCAGATTA TTGTATTTGA ACGTGGAATC TGGTCTTCGT CTTCAATTTT CATCCCAGTA 4620 
AAACTTATGA TGGGTAACTG ATCTCTTGCA AGCTTTGCCT TTCAATATTT CTTCTGCTTA 4680 
ATGACTAATG TGCTTAATCT CGTTTCCACT TTTAAAACAC GCAGTTACAA AGTCGGATGT 4740 
GACTTGCCTG GGAAGTACAA GGTAGCTCTG GACTCTGATG CTCTGATGTT TGGTGGACAT 4800 
GGAAGAGTAA GCAATGTTAA TGATGTTCAA GATCTGTTTT GCAACACTAT GTTCTTCTAT 4860 
AGAAGGGGCC ATCAAGGCTG CATCAGATAA TCTTATTTGC AGTGTTGATC TGTGCTGCAT 4920 
CGCAGGTGGC CCATGACAAC GATCACTTTA CGTCACCTGA AGGAGTACCA GGAGTACCTG 4980 
AAACAAACTT CAACAACCGC CCTAACTCAT TCAAAATCCT GTCTCCATCC CGCACTTGTG 5040 
TGGTAATGCT AATTACTAGG AGGATTTAGT AACAATAAAT AAATAACAGC AAAAGATATC 5100 
TGCAGTACGA TCTCACAAAA TGCTCTCTTG CCAGGCTTAC TATCGCGTCG AGGAGAAAGC 5160 
GGAAAAGCCC AAGGATGAAG GAGCTGCTTT CTTGGGGGAA ACTGCTCTCG GGTACATCGA 5220 
TGTTGAAGCC ACTGGCGTCA AAGACGCAGC AGATGGTGAG GCGACTTCTG GTTCCGAAAA 5280 
GGCGTCTACA GGAGGTGACT CCAGCAAGAA GGGAATTAAC TTTGTCTTTC TGTCACCCGA 5340 
CAAAGACAAC AAATAAGCAC CATATCAACG CTTGATCAGG ACCGTGTGCC GACGTCCTTG 5400 
TAATACTCCT GCTATTGCTA GTAGTAGCAA TACTGTCAAA CTGTGCAGAC TTGAAATTCT 5460 
GGCTTGGACT TTGCTGAGGT TACCTACTAT ATAGAAAGAT AAATAAGCGG TGATGGTGCG 5520 
GGTCGAGTCC AGCTATATGT GCCAAATATG CGCCATCCCG AGTCCTCTGT CATAAAGAAA 5580 
GTTTCGGGCT TCCATCCCAG AATAAAAACA GTTGTCTGTT TGCAATTTCT TTTTGTCTTG 5640 
CATAGTTACA TGATAATTGA TGCATATTGC TATAAGCCTG GATTGCATCT TCTTTTGCTA 5700 
A TAACTG CAG GGCCAAGAAA GCCTAGATTG TATCTTTTTT TG CTAATAAC TGCAGTGPTP, S7fin 



GGGAAGCTTC AGTCCTTGTT TCCGTTCTCG AGACAAGGCG TCATGTTTGG CGCACAAAGG 5820 
TAAGCCATCA TCTTATCAAG TCCCAAAATT CTCTGGTTGA AAGAAACCAT CACTAACTTG 5880 
TTCCAGGTGT TGGTTCCTCC ACAACCAAAA GGCGACCATC GTCGTCATCA TCGCTCACAG 5940 
CACTGACCAT CGAAGCCACG GTGGGCATGA AATGCGCATC GCCCAAGACT TGGGACCGTT 6000 
TCAAAATATC ACAAACTGCC ATGGCATCTT CTGCCAAAGG CTGCACTGCA CCTTTGGCAT 6060 
GAACAGAAGC AACAGGGGCT TGGAACTGAA CGCCGAAAAT AAAGTCAAAC CGGCTGGGCC 6120 
GGATTGAAAG GGGAAACGCC AAAATCCACT TAATTTGAAT GGAAGGAGGA ATGGTTCTTG 6180 
CTGGTTTCAA CTCTGCAGGC TTCCCTCTGA ATTTCACACG GAGCCATT 6228 

(2) INFORMATION FOR SEQ ID NO: 10: 
(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 11463 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: 

(vi) ORIGINAL SOURCE: 
(A) ORGANISM: triticum tauschii 
(F) TISSUE TYPE: Endosperm 

(ix) FEATURE: 

(A) NAME/KEY: misc_feature 

(B) LOCATION: 1..11463 

(D) OTHER INFORMATION:/product= "complete sequence of the 
starch branching enzyme II gene" 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 

AGAAACACCT CCATTTTAGA TTTTTTTTTT GTTCTTTTCG GACGGTGGGT CGTGGAGAGA 60 
TTAGCGTCTA GTTTTCTTAA AAGAACAGGC CATTTAGGCC CTGCTTTACA AAAGGCTCAA 120 
CCAGTCCAAA ACGTCTGCTA GGATCACCAG CTGCAAAGTT AAGCGCGAGA CCACCAAAAC 180 
AGGCGCATTC GAACTGGACA GACGCTCACG CAGGAGCCCA GCACCACAGG CTTGAGCCTG 240 
ACAGCGGACG TGAGTGCGTG ACACATGGGG TCATCTATGG GCGTCGGAGC AAGGAAGAGA 300 
GACGCACATG AACACCATGA TGATGCTATC AGGCCTGATG GAGGGAGCAA CCATGCACCT 360 
TTTCCCCTCT GGAAATTCAT AGCTCACACT TTTTTTTAAT GGAAGCAAGA GTTGGCAAAC 420 
ACATGCATTT TCAAACAAGG AAAATTAATT CTCAAACCAC CATGACATGC AATTCTCAAA 480 
CCATGCACCG ACGAGTCCAT GCGAGGTGGA AACGAAGAAC TGAAAATCAA CATCCCAGTT 540 
GTCGAGTCGA GAAGAGGATG ACACTGAAAG TATGCGTATT ACGATTTCAT TTACATACAT 600 
GTACAAATAC ATAATGTACC CTACAATTTG TTTTTTGGAG CAGAGTGGTG TGGTCTTTTT 660 
TTTTTACACG AAAATGCCAT AGCTGGCCCG CATGCGTGCA GATCGGATGA TCGGTCGGAG 720 
ACGACGGACA ATCAGACACT CACCAACTGC TTTTGTCTGG GACACAATAA ATGTTTTTGT 780 
AAACAAAATA AATACTTATA AACGAGGGTA CTAGAGGCCG CTAACGGCAT GGCCAGGTAA 840 
ACGCGCTCCC AGCCGTTGGT TTGCGATCTC GTCCTCCCGC ACGCAGCGTC GCCTCCACCG 900 
TCCGTCCGTC GCTGCCACCT CTGCTGTGCG CGCGCACGAA GGGAGGAAGA ACGAACGCCG 960 
CACACACACT CACACACGGC ACACTCCCCG TGGGTCCCCT TTCCGGCTTG GCGTCTATCT 1020 
CCTCTCCCCC GCCCATCCCC ATGCACTGCA CCGTACCCGC CAGCTTCCAC CCCCGCCGCA 1080 
CACGTTGCTC CCCCTTCTCA TCGCTTCTCA ATTAATATCT CCATCACTCG GGTTCCGCGC 1140 
TGCATTTCGG CCGGCGGGTT GAGTGAGATC TGGGCGACTG GCTGACTCAA TCACTACGCG 1200 
GGGATGGCGA CGTTCGCGGT GTCCGGCGCG ACTCTCGGTG TGGCGCGGGC CGGCGTCGGA 1260 
GTGGCGCGGG CCGGCTCGGA GCGGAGGGGC GGGGCGGACT TGCCGTCGCT GCTCCTCAGG 1320 

15 AAGAAGGACT CCTCTCGTAC GCCTCGCTCT CTCGAATCTC CCCCGTCTGG CTTTGGCTCC 13 80 
CCTTCTCTCT CCTCTGCGCG CGCATGGCCT GTTCGATGCT GTTCCCCAAT TGATCTCCAT 1440 
GAGTGAGAGA GATAGCTGGA TTAGGCGATC GCGCTTCCTG AACCTGTATT TTTTCCCCCG 1500 
CGGGGAAATG CGTTAGTGTC ACCCAGGCCC TGGTGTTACC ACGGCTTTGA TCATTCCTCG 15 60 
TTTCATTCTG ATATATATTT TCTCATTCTT TTTCTTCCTG TTCTTGCTGT AACTGCAAGT 1620 

25 TGTGGCGTTT TTTCACTATT GTAGTCATCC TTGCATTTTG CAGGCGCCGT CCTGAGCCGC 16 80 
GCGGCCTCTC CAGGGAAGGT CCTGGTGCCT GACGGCGAGA GGACGACTTG GCAAGTCCGG 1740 
CGCAACCTGA AGAATTACAG G T AC AC AC AC TCGTGCCGGT AAATCTTCAT ACAATCGTTA 18 00 
TTCACTTACC AAATGCCGGA TGAAACCAAC CACGGATGCG TCAGGTTTCG AGCTTCTTCT 1860 
ATCAGCATTG TGCAGTACTG CACTGCCTTG TTCATTTTGT TAGCCTTGGC CCCGTGCTGG 1920 

3 5 CTCTTGGGCC ACTGAAAAAA TCAGATGGAT GTGCATTCTA GCAAGAACTT CACAACATAA 1980 
TGCACCGTTT GGGGTTTCGT CAGTCTGCTC TACAATTGCT ATTTTTCGTG CTGTAGATAC 2 040 

4Q CTGAAGATAT CGAGGAGCAA ACGGCGGAAG TGAACATGAC AGGGGGGACT GCAGAGAAAC 2100 
TTCAATCTTC AGAACCGACT CAGGGCATTG TGGAAACAAT CACTGATGGT GTAACCAAAG 2160 
GAGTTAAGGA ACTAGTCGTG GGGGAGAAAC CGCGAGTTGT CCCAAAACCA GGAGATGGGC 2220 

45 AGAAAATATA CGAGATTGAC CCAACACTGA AAGATTTTCG GAGCCATCTT GACTACCGGT 22 80 
AATGCCTACC CGCTGCTTTC GCTCATTTTG AATTAAGGTC CTTTCATCAT GCAAATTTGG 2340 
GGAACATCAA AGAGACAAAG ACTAGGGACC ACCATTTCAT ACAGATCCCT TCGTGGTCTG 2400 
AGAATATGCT GGG AAGTAAA TGTATAATTG ATGGCTACAA TTTGCTCAAA ATTGCAATAC 2 4 60 
GAATAACTGT CTCCGATCAT TACAATTAAA GAGTGGCAAA CTGATGAAAA TGTGGTGGAT 2 520 

55 GGGTTATAGA TTTTACTTTG CTAATTCCTC TACCAAATTC CTAGGGGGGA AATCTACCAG 2 5 80 
TTGGGAAACT TAGTTTCTTA TCTTTGTGGC C TTTTTGTTT TGGGGAAAAC ACATTGCTAA 2 640 



ATTCGAATGA TTTTGGGTAT ACCTCGGTGG ATTCAACAGA TACAGCGAAT ACAAGAGAAT 2700
TCGTGCTGCT ATTGACCAAC ATGAAGGTGG ATTGGAAGCA TTTTCTCGTG GTTATGAAAA 2760
GCTTGGATTT ACCCGCAGGT AAATTTAAAG CTTTATTATT ATGAAACGCC TCCACTAGTC 2820
TAATTGCATA TCTTATAAGA AAATTTATAA TTCCTGTTTT CCCCTCTCTT TTTTCCAGTG 2880
CTGAAGGTAT CGTCTAATTG CATATCTTAT AAGAAAATTT ATATTCCTGT TTTCCCCTAT 2940
TTTCCAGTGC TGAAGGTATC ACTTACCGAG AATGGGCTCC CTGGAGCGCA TGTTATGTTC 3000
TTTTAAGTTC CTTAACGAGA CACCTTCCAA TTTATTGTTA ATGGTCACTA TTCACCAACT 3060

AGCTTACTGG ACTTACAAAT TAGCTTACTG AATACTGACC AGTTACTATA AATTTATGAT 3120
CTGGCTTTTG CACCCTGTTA CAGTCTGCAG CATTAGTAGG TGACTTCAAC AATTGGAATC 3180
CAAATGCAGA TACTATGACC AGAGTATGTC TACAGCTTGG CAATTTTCCA CCTTTGCTTC 3240
ATAACTACTG ATACATCTAT TTGTATTTAT TTAGCTGTTT GCACATTCCT TAAAGTTGAG 3300
CCTCAACTAC ATCATATCAA AATGGTATAA TTTGTCAGTG TCTTAAGCTT CAGCCCAAAG 3360
ATTCTACTGA ATTTAGTCCA TCTTTTTGAG ATTGAAAATG AGTATATTAA GGATGAATGA 3420
ATACGTGCAA CACTCCCATC TGCATTATGT GTGCTTTTCC ATCTACAATG AGCATATTTC 3480
CATGCTATCA GTGAAGGTTT GCTCCTATTG ATGCAGATAT TTGATATGGT CTTTTCAGGA 3540
TGATTATGGT GTTTGGGAGA TTTTCCTCCC TAACAACGCT GATGGATCCT CAGCTATTCC 3600
TCATGGCTCA CGTGTAAAGG TAAGCTGGCC AATTATTTAG TCGAGGATGT AGCATTTTCG 3660
AACTCTGCCT ACTAAGGGTC CCTTTTCCTC TCTGTTTTTT AGATACGGAT GGATACTCCA 3720
TCCGGTGTGA AGGATTCAAT TTCTGCTTGG ATCAAGTTCT CTGTGCAGGC TCCAGGTGAA 3780
ATACCTTTCA ATGGCATATA TTATGATCCA CCTGAAGAGG TAAGTATCGA TCTACATTAC 3840
ATTATTAAAT GAAATTTCCA GTGTTACAGT TTTTTAATAC CCACTTCTTA CTGACATGTG 3900
AGTCAAGACA ATACTTTTGA ATTTGGAAGT GACATATGCA TTAATTCACC TTCTAAGGGC 3960
TAAGGGGCAA CCAACCTTGG TGATGTGTGT ATGCTTGTGT GTGACATAAG ATCTTATAGC 4020
TCTTTTATGT GTTCTCTGTT GGTTAGGATA TTCCATTTTG GCCTTTTGTG ACCATTTACT 4080
AAGGATATTT ACATGCAAAT GCAGGAGAAG TATGTCTTCC AACATCTCAA CTAAACGACC 4140
AGAGTCACTA AGGATTTATG AATCACACAT TGGAATGAGC AGCCCGGTAT GTCAATAAGT 4200
TATTTCACCT GTTTCTGGTC TGATGGTTTA TTCTATGGAT TTTCTAGTTC TGTTATGTAC 4260
TGTTAACATA TTACATGGTG CATTCACTTG ACAACCTCGA TTTTATTTTC TAATGTCTTC 4320
ATATTGGCAA GTGCAAAACT TTGCTTCCTC TTTGTCTGCT TGTTCTTTTG TCTTCTGTAA 4380
GATTTCCATT GCATTTGGAG GCAGTGGGCA TGTGAAAGTC ATATCTATTT TTTTTTTGTC 4440
AGAGCATAGT TATATGAATT CCATTGTTGT TGCAATAGCT CGGTATAATG TAACCATGTT 4500
ACTAGCTTAA GATTTCCCAC TTAGGATGTA AGAAATATTG CATTGGAGCG TCTCCAGCAA 4560
GCCATTTCCT ACCTTATTAA TGAGAGAGAG ACAAGGGGGG GGGGGGGGGG GGGGTTCCCT 4620
TCATTATTCT GCGAGCGATT CAAAAACTTC CATTGTTCTG AGGTGTACGT ACTGCAGGGA 4680
TCTCCCATTA TGAAGAGGAT ATAGTTAATT CTTTGTAACC TACTTGGAAA CTTGAGTCTT 4740
GAGGCATCGC TAATATATAC TATCATCACA ATACTTAGAG GATGCATCTG AAATTTTAGT 4800
GTGATCTTGC ACAGGAACCG AAGATAAATT CATATGCTAA TTTTAGGGAT GAGGTGTTGC 4860
CAAGAATTAA AAGGCTTGGA TACAATGCAG TGCAGATAAT GGCAATCCAG GAGCATTCAT 4920
ACTATGCAAG CTTTGGGTAT TCACACAATC CATTTTTTTC TGTATACACT CTTCACCCAT 4980
TTGGAGCTAT TACATCCTAA TGCTTCATGC ACATAAAATA TTTGGATATA ATCCTTTATT 5040
AGATATATAG TACAACTACA CTTAGTATTC TGAAAAAGAT CATTTTATTG TTGTTGGCTT 5100
GTTCCAGGTA CCATGTTACT AATTTTTTTG CACCAAGTAG CCGTTTTGGA ACTCCAGAGG 5160
ACTTAAAATC CTTGATCGAT AGAGCACATG AGCTTGGTTT GCTTGTTCTT ATGGATATTG 5220
TTCATAGGTA ATTAGTCCAA TTTAATTTTA GCTGTTTTAC TGTTTATCTG GTATTCTAAA 5280
GGGAAATTCA GGCAATTATG ATACATTGTC AAAAGCTAAG AGTGGCGAAA GTGAAATGTC 5340
AAAATCTAGA GTGGCATAAG GAAAATTGGC AAAAACTAGA GTGGCAAAAA TAAAATTTTC 5400
CCATCCTAAA TGGCAGGGCC CTATCGCCGA ATATTTTTCC ATTCTATATA ATTGTGCTAC 5460
GTGACTTCTT TTTTCTCAGA TGTATTAAAC CAGTTGGACA TGAAATGTAT TTGGTACATG 5520
TAGTAAACTG ACAGTTCCAT AGAATATCGT TTTGTAATGG CAACACAATT TGATGCCATA 5580
GATGTGGATT GAGAAGTTCA GATGCTATCA ATAGAATTAA TCAACTGGCC ATGTACTCGT 5640
GGCACTACAT ATAGTTTGCA AGTTGGAAAA CTGACAGCAA TACCTCACTG ATAAGTGGCC 5700
AGGCCCCACT TGCCAGCTTC ATACTAGATG TTACTTCCCT GTTGAATTCA TTTGAACATA 5760
TTACTTAAAG TTCTTCATTT GTCCTAAGTC AAACTTCTTT AAGTTTGACC AAGTCTATTG 5820
GAAAATATAT CAACATCTAC AACACCAAAT TACTTTGATC AGATTAACAA TTTTTATTTT 5880
ATTATATTAG CACATCTTTG ATGTTGTAGA TATCAGCACA TTTTTCTATA GACTTGGTCA 5940
AATATAGAGA AGTTTGACTT AGGACAAATC TAGAACTTCA ATCAATTTGG ATCAGAGGGA 6000
ACATCAAATA ATATAGATAG ATGTCAACAC TTCAACAAAA AAATCAGACC TTGTCACCAT 6060
ATATGCATCA GACCATCTGT TTGCTTTAGC CACTTGCTTT CATATTTATG TGTTTGTACC 6120
TAATCTACTT TTCCTTCTAC TTGGTTTGGT TGATTCTATT TCAGTTGCAT TGCTTCATCA 6180
ATGATTTTGT GTACCCTGCA GTCATTCGTC AAATAATACC CTTGACGGTT TGAATGGTTT 6240
CGATGGCACT GATACACATT ACTTCCACGG TGGTCCACGC GGCCATCATT GGATGTGGGA 6300
TTCTCGTCTA TTCAACTATG GGAGTTGGGA AGTATGTAGC TCTGACTTCT GTCACCATAT 6360
TTGGCTAACT GTTCCTGTTA ATCTGTTCTT ACACATGTTG ATATTCTATT CTTATGCAGG 6420
TATTGAGATT CTTACTGTCA AACGCGAGAT GGTGGCTTGA AGAATATAAG TTTGATGGAT 6480
TTCGATTTGA TGGGGTGACC TCCATGATGT ATACTCACCA TGGATTACAA GTAAGTCATC 6540
AAGTGGTTTC AGTAACTTTT TTAGGGCACT GAAACAATTG CTATGCATCA TAACATGTAT 6600
CATGATCAGG ACTTGTGCTA CGGAGTCTTA GATAGTTCCC TAGTATGCTT GTACAATTTT 6660
ACCTGATGAG ATCATGGAAG ATTGGAAGTG ATTATTATTT ATTTTCTTTC TAAGTTTGTT 6720
TCTTGTTCTA GATGACATTT ACTGGGAACT ATGGCGAATA TTTTGGATTT GCTACTGATG 6780
TTGATGCGGT AGTTTACTTG ATGCTGGTCA ACGATCTAAT TCATGGACTT TATCCTGATG 6840
CTGTATCCAT TGGTGAAGAT GTAAGTGCTT ACAGTATTTA TGATTTTTAA CTAGTTAAGT 6900
AGTTTTATTT TGGGGATCAG TCTGTTACAC TTTTTGTTAG GGGTAAAATC TCTCTTTTCA 6960
TAACAATGCT AATTTATACC TTGTATGATA ATGCATCACT TAGTAATTTG AAAAGTGCAA 7020
GGGCATTCAA GCTTACGAGC ATATTTTTTG ATGGCTGTAA TTTATTTGAT AGTATGCTTG 7080
TTTGGGTTTT TCAATAAGTG GGAGTGTGTG ACTAATGTTG TATTATTTAT TTAATTGCGG 7140
AAGAAATGGG CAACCTTGTC AATTGCTTCA GAAGGCTAAC TTTGATTCCA TAAACGCTTT 7200
GGAAATGAGA GGCTATTCCC AAGGACATGA ATTATACTTC AGTGTGTTCT GTACATGTAT 7260
TTGTAATAGT GGTTTAACTT AAATTCCTGC ACTGCTATGG AATCTCACTG TATGTTGTAG 7320
TGTACACATC CACAAACAAG TAATCCTGAG CTTTCAACTC ATGAGAAAAT AGAGTCCGCT 7380
TCTGCCAGCA TTAACTGTTC ACAGTTCTAA TTTGTGTAAC TGTGAAATTG TTCAGGTCAG 7440
TGGAATGCCT ACATTTTGCA TCCCTGTTCC AGATGGTGGT GTTGGTTTTG ACTACCGCCT 7500
GCATATGGCT GTAGCAGATA AATGGATTGA ACTCCTCAAG TAAGTGCAGG AATATTGGTG 7560
ATTACATGCG CACAATGATC TAGATTACAT TTTCTAAATG GTAAAAAGGA AAATATGTAT 7620
GTGAATATCT AGACATTTGC CTGTTATCAG CTTGAATACG AGAAGTCAAA TACATGATTT 7680
AAATAGCAAA TCTCGGAAAT GTAATGGCTA GTGTCTTTAT GCTGGGCAGT GTACATTGCG 7740
CTGTAGCAGG CCAGTCAACA CAGTTAGCAA TATTTTCAGA AACAATATTA TTTATATCCG 7800
TATATGAGAA AGTTAGTATA TAAACTGTGG TCATTAATTG TGTTCACCTT TTGTCCTGTT 7860
TAAGGATGGG CAGTAGGTAA TAAATTTAGC CAGATAAAAT AAATCGTTAT TAGGTTTACA 7920
AAAGGAATAT ACAGGGTCAT GTAGCATATC TAGTTGTAAT TAATGAAAAG GCTGACAAAA 7980
GGCTCGGTAA AAAAAACTTT ATGATGATCC AGATAGATAT GCAGGAACGC GACTAAAGCT 8040
CAAATACTTA TTGCTACTAC ACAGCTGCCA ATCTGTCATG ATCTGTGTTC TGCTTTGTGC 8100
TATTTAGATT TAAATACTAA CTCGATACAT TGGCAATAAT AAACTTAACT ATTCAACCAA 8160
TTTGGTGGAT ACCAGAATTT CTGCCCTCTT GTTAGTAATG ATGTGCTCCC TGCTGCTGTT 8220
CTCTGCCGTT ACAAAAGCTG TTTTCAGTTT TTTGCATCAT TATTTTTGTG TGTGAGTAGT 8280
TTAAGCATGT TTTTTGAAGC TGTGAGCTGT TGGTACTTAA TACATTCTTG GAAGTGTCCA 8340
AATATGCTGC AGTGTAATTT AGCATTTCTT TAACACAGGC AAAGTGACGA ATCTTGGAAA 8400
ATGGGCGATA TTGTGCACAC CCTAACAAAT AGAAGGTGGC TTGAGAAGTG TGTAACTTAT 8460
GCAGAAAGTC ATGATCAAGC ACTAGTTGGT GACAAGACTA TTGCATTCTG GTTGATGGAT 8520
AAGGTACTAG CTGTTACTTT TGGACAAAAG AATTACTCCC TCCCGTTCCT AAATATAAGT 8580
CTTTGTAGAG ATTCCACTAT GGACCACATA GTATATAGAT GCATTTTAGA GTGTAGATTC 8640
ACTCATTTTG CTTCGTATGT AGTCCATAGT GAAATCTCTA CAGAGACTTA TATTTAGGAA 8700
CGGAGGGAGT ACATAATTGA TTTGTCTCAT CAGATTGCTA GTGTTTTCTT GTGATAAAGA 8760
TTGGCTGCCT CACCCATCAC CAGCTATTTC CCAACTGTTA CTTGAGCAGA ATTTGCTGAA 8820
AACGTACCAT GTGGTACTGT GGCGGCTTGT GAACTTTGAC AGTTATGTTG CAATTTTCTG 8880
TTCTTATTTA TTTGATTGCT TATGTTACCG TTCATTTGCT CATTCCTTTC CGAGACCAGC 8940
CAAAGTCACG TGTTAGCTGT GTGATCTGTT ATCTGAATCT TGAGCAAATT TTATTAATAG 9000
GCTAAAATCC AACGAATTAT TTGCTTGAAT TTAAATATAC AGACGTATAG TCACCTGGCT 9060
CTTTCTTAGA TGATTACCAT AGTGCCTGAA GGCTGAAATA GTTTTGGTGT TTCTTGGATG 9120
CCGCCTAAAG GAGTGATTTT TATTGGATAG ATTCCTGGCC GAGTCTTCGT TACAACATAA 9180
CATTTTGGAG ATATGCTTAG TAACAGCTCT GGGAAGTTTG GTCACAAGTC TGCATCTACA 9240
CGCTCCTTGA GGTTTTATTA TGGCGCCATC TTTGTAACTA GTGGCACCTG TAAGGAAACA 9300
CATTCAAAAG GAAACGGTCA CATCATTCTA ATCAGGACCA CCATACTAAG AGCAAGATTC 9360
TGTTCCAATT TTATGAGTTT TTGGGACTCC AAAGGGAACA AAAGTGTCTC ATATTGTGCT 9420
TATAACTACA GTTGTTTTTA TACCAGTGTA GTTTTATTCC AGGACAGTTG ATACTTGGTA 9480
CTGTGCTGTA AATTATTTAT CCGACATAGA ACAGCATGAA CATATCAAGC TCTCTTTGTG 9540
CAGGATATGT ATGATTTCAT GGCTCTGGAT AGGCTTCAAC TCTTCGCATT GATCGTGGCA 9600
TAGCATTACA TAAAATGATC AGGCTTGTCA CCATGGGTTT AGGTGGTGAA GGCTATCTTA 9660
ACTTCATGGG AAATGAGTTT GGGCATCCTG GTCAGTCTTT ACAACATTAT TGCATTCTGC 9720
ATGATTGTGA TTTACTGTAA TTTGAACCAT GCTTTTCTTT CACATTGTAT GTATTATGTA 9780
ATCTGTTGCT TCCAAGGAGG AAGTTAACTT CTATTTACTT GGCAGAATGG ATAGATTTTC 9840
CAAGAGGCCC ACAAACTCTT CCAACCGGCA AAGTTCTCCC CTGGAAATAA CAATAGTTAT 9900
GATAAATGCC GCCGTAGATT TGATCTTGTA AGTTTTAGCT GTGCTATTAC ATTCCCTCAC 9960
TAGATCTTTA TTGGCCATTT ATTTCTTGAT GAAATCATAA TGTTTGTTAG GAAAGATCAA 10020
CATTGCTTTT GTAGTTTTGT AGACGTTAAC ATAAGTATGT GTTGAGAGTT GTTGATCATT 10080
AAAAATATCA TGATTTTTTG CAGGGAGATG CAGATTTTCT TAGATATCGT GGTATGCAAG 10140
AGTTCGATCA GGCAATGCAG CATCTTGAGG AAAAATATGG GGTATGTCAC TGGTTTGTCT 10200
TTGTTGCATA ACAAGTCACA GTTTAACGTC AGTCTCTTCA AGTGGTAAAA AAAGTGTAGA 10260
ATTAATTCCT GTAATGAGAT GAAAACTGTG CAAAGGCGGA GCTGGAATTG CTTTTCACCA 10320
AAACTATTTT CTTAAGTGCT TGTGTATTGA TACATATACC AGCACTGACA ATGTAACTGC 10380
AGTTTATGAC ATCTGAGCAC CAGTATGTTT CACGGAAACA TGAGGAAGAT AAGGTGATCA 10440
TCCTCAAAAG AGGAGATTTG GTATTTGTTT TCAACTTCCA CTGGAGCAAT AGCTTTTTTG 10500
ACTACCGTGT TGGGTGTTCC AAGCCTGGGA AGTACAAGGT ATGCTTGCCT TTTCATTGTC 10560
CACCCTTCAC CAGTAGGGTT AGTGGGGGCT TCTACAACTT TTAATTCCAC ATGGATAGAG 10620
TTTGTTGGTC GTGCAGCTAT CAATATAAAG AATAGGGTAA TTTGTAAAGA AAAGAATTTG 10680
CTCGAGCTGT TGTAGCCATA GGAAGGTTGT TCTTAACAGC CCCGAAGCAC ATACCATTCA 10740
TTCATATTAT CTACTTAAGT GTTTGTTTCA ATCTTTATGC TCAGTTGGAC TCGGTCTAAT 10800
ACTAGAACTA TTTTCCGAAT CTACCCTAAC CATCCTAGCA GTTTTAGAGC AGCCCCATTT 10860
GGACAATTGG CTGGGTTTTT GTTAGTTGTG ACAGTTTCTG CTATTTCTTA ATCAGGTGGC 10920
CTTGGACTCT GACGATGCAC TCTTTGGTGG ATTCAGCAGG CTTGATCATG ATGTCGACTA 10980
CTTCACAACC GTAAGTCTGG GCTCAAGCGT CACTTGACTC GTCTTGACTC AACTGCTTAC 11040
AAATCTGAAT CAACTTCCCA ATTGCTGATG CCCTTGCAGG AACATCCGCA TGACAACAGG 11100
CCGCGCTCTT TCTCGGTGTA CACTCCGAGC AGAACTGCGG TCGTGTATGC CCTTACAGAG 11160
TAAGAACCAG CAGCGGCTTG TTACAAGGCA AAGAGAGAAC TCCAGAGAGC TCGTGGATCG 11220
TGAGCGAAGC GACGGGCAAC GGCGCGAGGC TGCTCCAAGC GCCATGACTG GGAGGGGATC 11280
GTGCCTCTTC CCCAGATGCC AGGAGGAGCA GATGGATAGG TAGCTTGTTG GTGAGCGCTC 11340
GAAAGAAAAT GGACGGGCCT GGGTGTTTGT TGTGCTGCAC TGAACCCTCC TCCTATCTTG 11400
CACATTCCCG GTTGTTTTTG TACATATAAC TAATAATTGC CCGTGCGCTC AACGTGAAAA 11460
TCC 11463

(2) INFORMATION FOR SEQ ID NO: 11:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2662 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE:

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Triticum tauschii
(F) TISSUE TYPE: Endosperm

(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 1..2651
(D) OTHER INFORMATION: /product= "nucleotide sequence of cDNA wheat SSS I"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:

TCTCCCACTC TTCTCTCCCC GCGCACACCG AGTCGGCACC GGCTCATCAC CCATCACCTC 60
GGCCTCGGCC ACCGGCAAAC CCCCCGATCC GCTTTTGCAG GCAGCGCACT AAAACCCCGG 120
GGAGCGCGCC CCGCGGCAGC AGCAGCACCG CAGTGGGAGA GAGAGGCTTC GCCCCGGCCC 180
GCACCGAGCG GGGCGATCCA CCGTCCGTGC GTCCGCACCT CCTCCGCCTC CTCCCCTGTC 240
CCGCGCGCCC ACACCCATGG CGGCGACGGG CGTCGGCGCC GGGTGCCTCG CCCCCAGCGT 300
CCGCCTGCGC GCCGATCCGG CGACGGCGGC CCGGGCGTCC GCCTGCGTCG TCCGCGCGCG 360
GCTCCGGCGC TTGGCGCGGG GCCGCTACGT TGCCGAGCTC AGCAGGGAGG GCCCCGCGGC 420
GCGCCCCGCG CAGCAGCAGC AACTGGCCCC GCCGCTCGTG CCAGGCTTCC TCGCGCCGCC 480
GCCGCCCGCG CCCGCCCAGT CGCCGGCCCC GACGCAGCCG CCCCTGCCGG ACGCCGGCGT 540
GGGGGAACTC GCGCCCGACC TCCTGCTCGA AGGGATTGCT GAGGATTCCA TCGACAGCAT 600
AATTGTGGCT GCAAGTGAGC AGGATTCTGA GATCATGGAT GCGAATGAGC AACCTCAAGC 660
TAAAGTTACA CGTAGCATCG TGTTTGTGAC TGGTGAAGCT GCTCCTTATG CAAAGTCAGG 720
GGGGCTGGGA GATGTTTGTG GTTCGTTACC AATTGCTCTT GCTGCTCGTG GTCACCGTGT 780
GATGGTTGTA ATGCCAAGAT ACTTGAATGG GTCCTCTGAT AAAAACTATG CAAAGGCATT 840
ATACACTGGG AAGCACATTA AGATTCCATG CTTTGGGGGA TCACATGAAG TGACCTTTTT 900
TCATGAGTAT AGAGACAACG TCGATTGGGT GTTTGTCGAT CATCCGTCAT ATCATAGACC 960
AGGAAGTTTA TATGGAGATA ATTTTGGTGC TTTTGGTGAT AATCAGTTCA GATACACACT 1020
CCTTTGCTAT GCTGCATGCG AGGCCCCACT AATCCTTGAA TTGGGAGGAT ATATTTATGG 1080
ACAGAATTGC ATGTTTGTTG TGAACGATTG GCATGCCAGC CTTGTGCCAG TCCTTCTTGC 1140
TGCAAAATAT AGACCATACG GTGTTTACAG AGATTCCCGC AGCACCCTTG TTATACATAA 1200
TTTAGCACAT CAGGGTCTGG AGCCTGCAAG TACATATCCT GATCTGGGAT TGCCACCTGA 1260
ATGGTATGGA GCTTTAGAAT GGGTATTTCC AGAATGGGCA AGGAGGCATG CCCTTGACAA 1320
GGGTGAGGCA GTTAACTTTT TGAAAGGAGC AGTCGTGACA GCAGATCGAA TTGTGACCGT 1380
CAGTCAGGGT TATTCATGGG AGGTCACAAC TGCTGAAGGT GGACAGGGCC TCAATGAGCT 1440
CTTAAGCTCC CGAAAAAGTG TATTGAATGG AATTGTAAAT GGAATTGACA TTAATGATTG 1500
GAACCCCACC ACAGACAAGT GTCTCCCTCA TCATTATTCT GTCGATGACC TCTCTGGAAA 1560
GGCCAAATGT AAAGCTGAAT TGCAGAAGGA GCTGGGTTTA CCTGTAAGGG AGGATGTTCC 1620
TCTGATTGGC TTTATTGGAA GACTGGATTA CCAGAAAGGC ATTGATCTCA TTAAAATGGC 1680
CATTCCAGAG CTCATGAGGG AGGACGTGCA GTTTGTCATG CTTGGATCTG GGGATCCAAT 1740
TTTTGAAGGC TGGATGAGAT CTACCGAGTC GAGTTACAAG GATAAATTCC GTGGATGGGT 1800
TGGATTTAGT GTTCCAGTTT CCCACAGAAT AACTGCAGGT TGCGATATAT TGTTAATGCC 1860
ATCCAGGTTT GAACCTTGTG GTCTTAATCA GCTATATGCT ATGCAATATG GTACAGTTCC 1920
TGTAGTTCAT GGAACTGGGG GCCTCCGAGA CACAGTCGAG ACCTTCAACC CTTTTGGTGC 1980
AAAAGGAGAG GAGGGTACAG GGTGGGCGTT CTCACCGCTA ACCGTGGACA AGATGTTGTG 2040
GGCATTGCGA ACCGCGATGT CGACATTCAG GGAGCACAAG CCGTCCTGGG AGGGGCTCAT 2100
GAAGCGAGGC ATGACGAAAG ACCATACGTG GGACCATGCC GCCGAGCAGT ACGAGCAGAT 2160
CTTCGAATGG GCCTTCGTGG ACCAACCCTA CGTCATGTAG ACGGGGACTG GGGAGGTCGA 2220
AGCGCGGGTC TCCTTGAGCT CTGAAGACAT GTTCCTCATC CTTCCGCGGC CCGGAAGGAT 2280

AGGGGTGTAG ATTGGGTTGT CCTGCTACAG TAGAGTCGCA ATGCGCCTGC TTGCTTGGTC 2340

CGCCGGTTCG AGAGTAGATG ACGGCTGTGC TGCTGCGGCG GTGACAGCTT CGGGTGGATG 2400
ACAGTTACAG TTTTGGGGAA TAAGGAAGGG ATGTGCTGCA GGATGGTTAA CAGCAAAGCA 2460
CCACTCAGAT GGCAGCCTCT CTGTCCGTGT TACAGCTGAA ATCAGAAACC AACTGGTGAC 2520
TCTTTAGCCT TAGCGATTGT GAAGTTTGTT GCATTCTGTG TATGTTGTCT TGTCCTTAGC 2580
TGACAAATAT TAGACCTGTT GGAGAATTTT ATTTATCTTT GCTGCTGTTG TTTTTGTTTT 2640
GTTAAAAAAA AAAAAAAAAA AA 2662

(2) INFORMATION FOR SEQ ID NO: 12:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 768 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(iii) HYPOTHETICAL: NO

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Triticum tauschii

(ix) FEATURE:
(A) NAME/KEY: Protein
(B) LOCATION: 1..768

(ix) FEATURE:
(A) NAME/KEY: Protein
(B) LOCATION: 1..768
(D) OTHER INFORMATION: /product= "deduced amino acid sequence SBE II"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:

Met Ala Thr Phe Ala Val Ser Gly Ala Thr Leu Gly Val Ala Arg Pro
1 5 10 15

Pro Ala Ala Ala Gln Pro Glu Glu Leu Gln Ile Pro Glu Asp Ile Glu
20 25 30

Glu Gln Thr Ala Glu Val Asn Met Thr Gly Gly Thr Ala Glu Lys Leu
35 40 45

Glu Ser Ser Glu Pro Thr Gln Gly Ile Val Glu Thr Ile Thr Asp Gly
50 55 60

Val Thr Lys Gly Val Lys Glu Leu Val Val Gly Glu Lys Pro Arg Val
65 70 75 80

Val Pro Lys Pro Gly Asp Gly Gln Lys Ile Tyr Glu Ile Asp Pro Thr
85 90 95

Leu Lys Asp Phe Arg Ser His Leu Asp Tyr Arg Tyr Ser Glu Tyr Arg
100 105 110

Arg Ile Arg Ala Ala Ile Asp Gln His Glu Gly Gly Leu Glu Ala Phe
115 120 125

Ser Arg Gly Tyr Glu Lys Leu Gly Phe Thr Arg Ser Ala Glu Gly Ile
130 135 140

Thr Tyr Arg Glu Trp Ala Pro Gly Ala His Ser Ala Ala Leu Val Gly
145 150 155 160



Asp Phe Asn Asn Trp Asn Pro Asn Ala Asp Thr Met Thr Arg Asp Asp
165 170 175

Tyr Gly Val Trp Glu Ile Phe Leu Pro Asn Asn Ala Asp Gly Ser Pro
180 185 190

Ala Ile Pro His Gly Ser Arg Val Lys Ile Arg Met Asp Thr Pro Ser
195 200 205

Gly Val Lys Asp Ser Ile Ser Ala Trp Ile Lys Phe Ser Val Gln Ala
210 215 220

Pro Gly Glu Ile Pro Phe Asn Gly Ile Tyr Tyr Asp Pro Pro Glu Glu
225 230 235 240

Glu Lys Tyr Val Phe Gln His Pro Gln Pro Lys Arg Pro Glu Ser Leu
245 250 255

Arg Ile Tyr Glu Ser His Ile Gly Met Ser Ser Pro Glu Pro Lys Ile
260 265 270

Asn Ser Tyr Ala Asn Phe Arg Asp Glu Val Leu Pro Arg Ile Lys Arg
275 280 285

Leu Gly Tyr Asn Ala Val Gln Ile Met Ala Ile Gln Glu His Ser Tyr
290 295 300

Tyr Ala Ser Phe Gly Tyr His Val Thr Asn Phe Phe Ala Pro Ser Ser
305 310 315 320

Arg Phe Gly Thr Pro Glu Asp Leu Lys Ser Leu Ile Asp Arg Ala His
325 330 335

Glu Leu Gly Leu Leu Leu Met Asp Ile Val His Ser His Ser Ser
340 345 350

Asn Asn Thr Leu Asp Gly Leu Asn Gly Phe Asp Gly Thr Asp Thr His
355 360 365

Tyr Phe His Gly Gly Pro Arg Gly His His Trp Met Trp Asp Ser Arg
370 375 380

Leu Phe Asn Tyr Gly Ser Trp Glu Val Leu Arg Phe Leu Leu Ser Asn
385 390 395 400

Ala Arg Trp Trp Leu Glu Glu Tyr Lys Phe Asp Gly Phe Arg Phe Asp
405 410 415

Gly Val Thr Ser Met Met Tyr Thr His His Gly Leu Gln Met Thr Phe
420 425 430

Thr Gly Asn Tyr Gly Glu Tyr Phe Gly Phe Ala Thr Asp Val Asp Ala
435 440 445

Val Val Tyr Leu Met Leu Val Asn Asp Leu Ile His Gly Leu His Pro
450 455 460

Asp Ala Val Ser Ile Gly Glu Asp Val Ser Gly Met Pro Thr Phe Cys
465 470 475 480

Ile Pro Val Pro Asp Gly Gly Val Gly Phe Asp Tyr Arg Leu His Met
485 490 495

Ala Val Ala Asp Lys Trp Ile Glu Leu Leu Lys Gln Ser Asp Glu Ser
500 505 510

Trp Lys Met Gly Asp Ile Val His Thr Leu Thr Asn Arg Arg Trp Leu
515 520 525

Glu Lys Cys Val Thr Tyr Ala Glu Ser His Asp Gln Ala Leu Val Gly
530 535 540

Asp Lys Thr Ile Ala Phe Trp Leu Met Asp Lys Asp Met Tyr Asp Phe
545 550 555 560

Met Ala Leu Asp Arg Pro Ser Thr Pro Arg Ile Asp Arg Gly Ile Ala
565 570 575

Leu His Lys Met Ile Arg Leu Val Thr Met Gly Leu Gly Gly Glu Gly
580 585 590

Tyr Leu Asn Phe Met Gly Asn Glu Phe Gly His Pro Glu Trp Ile Asp
595 600 605

Phe Pro Arg Gly Pro Gln Thr Leu Pro Thr Gly Lys Val Leu Pro Gly
610 615 620

Asn Asn Asn Ser Tyr Asp Lys Cys Arg Arg Arg Phe Asp Leu Gly Asp
625 630 635 640

Ala Asp Phe Leu Arg Tyr His Gly Met Gln Glu Phe Asp Gln Ala Met
645 650 655

Gln His Leu Glu Glu Lys Tyr Gly Phe Met Thr Ser Glu His Gln Tyr
660 665 670

Val Ser Arg Lys His Glu Glu Asp Lys Val Ile Ile Phe Glu Arg Gly
675 680 685

Asp Leu Val Phe Val Phe Asn Phe His Trp Ser Asn Ser Phe Phe Asp
690 695 700

Tyr Arg Val Gly Cys Ser Arg Pro Gly Lys Tyr Lys Val Ala Leu Asp
705 710 715 720

Ser Asp Asp Ala Leu Phe Gly Gly Phe Ser Arg Leu Asp His Asp Val
725 730 735

Asp Tyr Phe Thr Thr Glu His Pro His Asp Asn Arg Pro Arg Ser Phe
740 745 750

Ser Val Tyr Thr Pro Ser Arg Thr Ala Val Val Tyr Ala Leu Thr Glu
755 760 765

(2) INFORMATION FOR SEQ ID NO: 13:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 10550 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: both
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Triticum tauschii

(ix) FEATURE:
(A) NAME/KEY: exon
(B) LOCATION: 1..316
(D) OTHER INFORMATION: /product= "exon 1"

(ix) FEATURE:
(A) NAME/KEY: exon
(B) LOCATION: 1472..1828
(D) OTHER INFORMATION: /product= "exon 2"

(ix) FEATURE:
(A) NAME/KEY: exon
(B) LOCATION: 2766..2823
(D) OTHER INFORMATION: /product= "exon 3"

(ix) FEATURE:
(A) NAME/KEY: exon
(B) LOCATION: 2906..3028
(D) OTHER INFORMATION: /product= "exon 4"

(ix) FEATURE:
(A) NAME/KEY: exon
(B) LOCATION: 4113..4194
(D) OTHER INFORMATION: /product= "exon 5"

(ix) FEATURE:
(A) NAME/KEY: exon
(B) LOCATION: 4286..4459
(D) OTHER INFORMATION: /product= "exon 6"

(ix) FEATURE:
(A) NAME/KEY: exon
(B) LOCATION: 4562..4643
(D) OTHER INFORMATION: /product= "exon 7"

(ix) FEATURE:
(A) NAME/KEY: exon
(B) LOCATION: 4744..4855
(D) OTHER INFORMATION: /product= "exon 8"

(ix) FEATURE:
(A) NAME/KEY: exon
(B) LOCATION: 4999..5021
(D) OTHER INFORMATION: /product= "exon 9"

(ix) FEATURE:
(A) NAME/KEY: exon
(B) LOCATION: 5102..5192
(D) OTHER INFORMATION: /product= "exon 10"

(ix) FEATURE:
(A) NAME/KEY: exon
(B) LOCATION: 8593..8718
(D) OTHER INFORMATION: /product= "exon 11"

(ix) FEATURE:
(A) NAME/KEY: exon
(B) LOCATION: 8807..8915
(D) OTHER INFORMATION: /product= "exon 12"

(ix) FEATURE:
(A) NAME/KEY: exon
(B) LOCATION: 8992..9104
(D) OTHER INFORMATION: /product= "exon 13"

(ix) FEATURE:
(A) NAME/KEY: exon
(B) LOCATION: 9161..9199
(D) OTHER INFORMATION: /product= "exon 14"

(ix) FEATURE:
(A) NAME/KEY: exon
(B) LOCATION: 9498..9713
(D) OTHER INFORMATION: /product= "exon 15"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:




25 


— — r^r^r^n^vr^nr' cnrTTinnTCC C*TCC\CCCCC A GCGTPCGCCT 
ATGGCOOCCjA CGOOv-Cj I L-ULi L-OCL-OOLj 1 I v_vjv_i^i^\^.v_/a vjv_vj i ^,\^\j\_v_ i 


50 




r^mrmm at nnnnnr\ Apr.r. ^^.^.^^^^^1^1^ GTGGGCTTGC GTCGTCCGCG 


100 


30 


mmm-mm nrr.rTTnr.m rnnCiCiCCCiCT ACGTCGCCGA GCTCAGCAGG 


150 


GAGGGCCCCG CGGCGCGCCC CGCGCAGCAG CAGCAACTGG CCCCGCCGCT 200
CGTGCCAGGC TTCCTCGCGC CGCCGCCGCC CGCGCCCGCC CAGTCGCCGG 250
CCCCGACGCA GCCGCCCCTG CCGGACGCCG GCGTGGGGGA ACTCGCGCCC 300
GACCTCCTGC TCGAAGGTAA AAAACAAGGC TGAATCCTCA GATCACTCCG 350
CGTCTTCGTT TTACCAAATA CGGTACTGCG AAGTGGTGCT GTATATGTGA 400
AGTTTCTGTC GATTTCTTCC TGACGGATGT TCAGTCGATT CAGTTGTATA 450
TATGTGATAC GTTCGTTGTT CATCGATCGT ACAGATTTAC CAGCACACTA 500
GATAGAAATC GAGACCGACG CGGGCAGATC AATAGATTTT TCTAGACGTT 550
TTATTGGATC GTGAGATGAT TGATTGGGGT GGCGTGTCGA TACGATAGCG 600
GTGCACCGCC GATGTATCGG GGCATGTGCA CGTGGTTGGG TCTCAGCAGA 650
CATATCACTA GACTGGTATC GTAATTTACT AGTACTACTG GAAAGAGGAC 700
TAAAAAGGCT AGGCCAAGTG CACGCATGTT GGGAACGTTG TTAAATTGAT 750
GAGTTTGTCC TTTGCTTGGG CTGGTATTAT TACCAAAAAA TGGTGTTAGT 800
CCCTGTACTT ATTAATGGGA AAATCTTAAC ATGACACTGG GGTTTATGAG 850
TCTCCAATTG TATATTCTCA GCACTCAACT GATTTTACTG ATACTGTAGT 900
GGAAATGACA CGTGAGCACC CCCCTTCAAG GAATGCAATG CTTCTTTCTG 950
TTTTATATTA CAGGAACTAG AAGGAGCTTC CACCTTTGAG TACAGAAGTA 1000
CTCCCTCCGT TCCAAAATAG ATGACTCAAC TTTGTACTAA TTTTGTACTA 1050
TAGTTAGTAC AAAGTTGAGT CATCTATTTT AGAACGGAGG GAGTAGTATC 1100
GAAATTGAAG ACCCTTGTAT TACTGTCTTG TTTTTCAATG AAAATGGGAG 1150
GCCCATGCAG TAAGTCACAT GGGCACCTGG GAGGCTGGGA TCATGTGTGC 1200
TTTGCAGAGT ACTAGACCCA GCTCACCCTC TGTTAGATTA CTTGTTGGGC 1250
TGCTACTTTG TGTTTGCTGT GCAGTATATC AGACATCCTG AATTTGGCAT 1300
CTAGCTGAGA ACAGAATGCA GGTTGCACCA TTCTTATTAT TGCTAAACTG 1350
TTGTCACGCA ATTTATAAAG AATGTGATCT TCTGAGTATT AATTAATCAT 1400
GTTCTGCTAA TATCTGTCCT CGCTCTGGTG TTGACAAATA TACCATATGA 1450
ATATTTTCCA TTTTGCAACC AGGGATTGCT GAGGATTCCA TCGACAGCAT 1500
AATCGTGGCT GCAAGTGAGC AGGATTCTGA GATCATGGAT GCGAATGAGC 1550
AACCTCAAGC TAAAGTTACA CGTAGCATCG TGTTTGTGAC TGGTGAAGCT 1600
GCTCCTTATG CAAAGTCAGG GGGGCTGGGA GATGTTTGTG GTTCGTTACC 1650
AATTGCTCTT GCTGCTCGTG GTCACCGTGT GATGGTTGTA ATGCCAAGAT 1700
ACTTGAATGG GTCCTCTGAT AAAAACTATG CAAAGGCATT ATACACTGCG 1750
AAGCACATTA AGATTCCATG CTTTGGGGGA TCACATGAAG TGACCTTTTT 1800
TCATGAGTAT AGAGACAACG TCGATTGGGT GGGTACACAA TCACCTTCTT 1850
ATTCf CTGTT GAATTGTAGC AACTGTTTAT CCTTGTTTAC ACTTCTTTTA 1900
GCCCTGCAAA GACATATGTG ATTTCCATAC TTTTTTGTTA TTTCCCTTGT 1950
ACTCTTGCTC ATGAAGGTCA AAATATCATA TATCCATGGA AGTCATGCAT 2000

-GTGCCTAGTA-TXnTGGTGT-GGGTGGGTTT-AAG — — 2050" 



TGGAATTTGA TAACTAAAGT TTATTTTATT GAAAAAAATT GTAGGTTGG 2100
TGAGCCCACA GCCACGCAGT GGCACCACTG CTTGCACATG ATTTTGCATT 2150
TCTGTTTGCA CCGAGCACTT CATGTGAATA AGGTGTAAAA TCATAAAGTA 2200
CCAATTTTAT TCTGCCAATT GCACTTAAGA GTATATACAT TTATCTTGGC 2250

CTCAATCATG GGAGTACTGT GCATTCAGTG CACCATCATT GTTCTAAGGA 2300
GAAAATGTGG GTGCAAGGAA GACACTTTTG TCCCTTAATA AAAGGCAGGC 2350
ACTCTGTTGT CATATAGATA GAAAGCAACA AACTTATTTC AAAGAGCTAA 2400
CAATGGCAAA AGAACCAAAA AAAGCATGCT AAGGCGGTGA CACCAAAAGG 2450
TGAGGGGGGC CTTGTGACTG ACAGCACCCC AAACTATTGC CATTGTTTTA 2500
CTAAATGAAG ATCATTTTAG AAGCTCTCAG GAACTTCGAA AACAGTGGCT 2550
TTCCGTCCAC AGATCGTCTG TTAATATTTT TGTCCAGTGA TACTTTTTTT 2600
GCTCCTTACA AGAGTGCCTA TGTTGACATA TACATTGTTA AGTTGTTCAT 2650
AAGTTTACTT CTTATTCTAA ACAGCAAGTG CCTAATGCTT GCATTTATTT 2700
TGGCTATTTA TTTTTATTCT CATTTCAATC AACACTTTTG TTCAGGTGTT 2750
TGTCGATCAT CCGTCATATC ATAGACCAGG AAGTTTATAT GGAGATAATT 2800
TTGGTGCTTT TGGTGATAAT CAGGTACACT ACACTATACT AAGCTCCTAG 2850
TTGACTAAGT CGTAAGTTGT ACCTCCTCGC TGACCGGCTG CTCTATGTCG 2900
TGCAGTTCAG ATACACACTC CTTTGCTATG CTGCATGCGA GGCCCCACTA 2950
ATCCTTGAAT TGGGAGGATA TATTTATGGA CAGAATTGCA TGTTTGTTGT 3000
GAACGATTGG CATGCCAGCC TTGTGCCAGT GTACGTTGTT TGTGGATCTG 3050
AAAGTCCAAT CCTTTATTCA TTCTCTGCTT TGCAGTGTGC CCATGTCTAC 3100
ATTTCTTTTA TGCTTTTTTC ATGTCTGTTC TTATATTGCA TATATGCTTA 3150
TGGAGTCTAA AAGTTACCGG AGGGAATAAC TCTTAAGGAT TTCCTCAATC 3200
AATTATCTTT AGCTTTAGTT AACATTTACT GTGGCAAACA TAATGTGTTT 3250
TGAGATTTAC AAGTTCAGAG ATTGCACTTC ACTAGTTCGT AGCTAATCTG 3300
ATGTTTTCCC CGAGAAAATG CCTAAAGCTT TGTGTCTTGA TGCATTGATA 3350
GAAAAAGAGT TTATGTACAC TCCCAAAGAG GGGACCCAAA ATTACAACAC 3400
CACACCCCTG AGAACTAGGC GCTGCCGGAA GAAGCGATCC AAGCCCCACT 3450
GCCCCTGCCT TAGCTCAAAG CCGGGCGTCA GCTTGATTGT GTCAAGTAAG 3500
CTAGCAGTGC TAGATTGCGC AAGGTCGATT CGTCGAAGAT GACAGTGTTG 3550
CGCTGCTTCC AAATCCACCA AACTATGAGC ATGATCACTG GAGAAGTACC 3600
TTTTCTCGCG GCTGAGGGGG TGGACTGGTG GTCTGCTGCT GCCAGTTTTC 3650

AGATAATCTG AAAAATGCAT GTTTTGATGA TTTTAGTATC TTGCGGACCC 3700
TGGGTACCAC CTAAGCTTTC ACACAGTAAT TTGCAGTTAC ACCTATAAAA 3750
GTAACGGTCA TGATATGCAT GTGTTTTGGG TAGATCATGG TGCATGCATT 3800
TTAGGAATTA GGACATGCCA GAACCACGTG AGGCTTATGG GGCAATTCAT 3850
TTGTTCCATT ATACGAGTCA TGAATATGGT TCAGCATGTT TGGACGCTAC 3900
TTGTTTGGGG CAATTTCAGA TGGTGAATTG TAGCTGCTTG ATGTTGGCTA 3950
GCTGGCTTAT TTTGTACAAG TATCGATGTT AGATGCATAT TTCCTTTTGT 4000
TCTTGTGCTG TTTGCCATGT TGTATTCCCC TTTTCTGTCG CCAGTGTTGC 4050
ATGTTAAATT GGTTTTCATT ACATAATCAA CTTTGTTGCT GACATCAGTC 4100
ATTTTTATTC AGCCTTCTTG CTGCAAAATA TAGACCATAC GGTGTTTACA 4150
GAGATTCCCG CAGCACCCTT GTTATACATA ATTTAGCACA TCAGGTTTGG 4200
GTCTATCACC TTTCATTATC CGTACATGGC TTTGTAAGTC GGTTCACACG 4250
TATCGTCATA CTGTATGTTA TTTCAATGTC ATTAGGGTGT GGAGCCTGCA 4300
AGTACATATC CTGATCTGGG ATTGCCACCT GAATGGTATG GAGCTTTAGA 4350
ATGGGTATTT CCAGAATGGG CAAGGAGGCA TGCCCTTGAC AAGGGTGAGG 4400
CAGTTAACTT TTTGAAAGGA GCAGTTGTGA CAGCAGATCG AATTGTGACC 4450
GTCAGTCAGG TGAAATACTC AATACTTCTC TTTTTTCTTT GCGGGATGTT 4500
CTTCAGTTCA ATTGCCCTGT CTTTCACCCA ATTAAGAAAT GATTTAATCT 4550
TTTGTTTCTA GGGTTATTCA TGGGAGGTCA CAACTGCTGA AGGTGGACAG 4600
GGCCTCAATG AGCTCTTAAG CTCCCGAAAA AGTGTATTGA ATGGTAACTA 4650
TATTTGAATC CACTTATCTT CTTCTGAAAC ATATTTACAG AAATAGATGG 4700
ATGGGTTGCA AGAATAAATT CAGTTTGCTC TTTCGGTATG AAGGAATTGT 4750
AAATGGAATT GACATTAATG ATTGGAACCC CACCACAGAC AAGTGTCTCC 4800

-CTCATCATTA-TTGTGTGGAT-GAGGTGTGTG GAAAGGTGTG TGGATAGTAC 4850" 



CCTATATAAT AACATGTATA TCTGATCTAG TACTTTCTTT TTCTTTGCTA 4900
GTTTGCTTCC CATGATGTTC TCACTAACTA ATCCTATGTG GTTTGGCATA 4950
CTTGTCAGGC CAAATGTAAA GCTGAATTGC AGAAGGAGCT GGGTTTACCT 5000
GTAAGGGAGG ATGTTCCTCT GGTTAGATAC AAACCCCTAA GATATATATT 5050
TTTTAAATCC CTAAAAAAAA CTTGCCGATC ATCTCATTAG CTTGATTCAC 5100
AGATTGGCTT TATTGGAAGA CTGGATTACC AGAAAGGCAT TGATCTCATT 5150
AAAATGGCCA TTCCAGAGCT CATGAGGGAG GACGTGCAGT TTGTAAGTTC 5200
ATATTCTTTT TCTTGAGACT AGAGTATAAA TCAAACATGT AGGTGTGGGG 5250
TGGTATAATA CAGACATAAG TTCCAGCTAT TGCTTCCATG AGAATTTTAA 5300
TGCTATTCAG TAATATGCTA CTGCAAGTTT TGAAACAAAG TTGGAAGCAA 5350
TAAATATATG TGTAGCACTG ACCATGCAGT GCCACTATAG CTGGAATGTC 5400
CTGTAGTCTA TGTGATCTAA CACACTCAAC AACATGTTTT CGCATACAAA 5450
CACATGCGTG CGCGCAACAA ACATACTCTA CAATAAAATT GGCTTGGTGA 5500
ACTGCAGACA TGCTCTTATC TCCATTCCAA CATTTCTTGT TTCAACATTG 5550
GCTGAAGACT AAGAGAAGGG GGACCCAGGG TGATGTAGCC AACTAGATCC 5600
AGTAAGGAAG CTAGCCGAGC CTAGGAGGAT TCGCTTAGGT AGCTGGAACG 5650
TAGGGTCTCT GACAGGGAAG CTTCGGGAGC TAGTCGATGC AGTGGTGAGG 5700
AGAGGTGTTG ATATCCTTTG CGTCCAAGAA ACCAAATGTA GGGGACAGAA 5750

GGCGAAGGAG GTGGAGGATA CCGGCTTCAA GCTGTGGTAC ATGGGACGGC 5800
TGCAAACAGA AATGGCGTAG GCATCTTGAT CAACAAGAGC CTTAAGTATG 5850
GAGTGGTAGA CGTCAAGAGA CGTGGGGACC GGATTATCCT CGTCAAGCTG 5900
GTAGTTGGGG ACTTAGTTCT CAATGTTATC AGCGTGTATG CCCCGCAAGT 5950
AGGCCACAAT GAGAACGCCA AGAGGGAGTT CTGGGAAGGC CTGGAAGACA 6000
TGGTTAGGAG TGTACCGATT GGCGAGAAGC TCTTCATAGG AGGAGACCTC 6050
AATGGCCACG TGGGTACATC TAACATAGGT TTTGAAGGGG CACATGGGGG 6100
CTTTGGCTAT GGCATCAAGA ATCAAGAAGA AGATGTCTTA CGCTTTGCTC 6150
TAGCCTACGA CATGATTGTA GCTAACACCC TCTTTAGAAA GAGAGAATCA 6200
CATCTGGTGA CTTTTAGTAG TGGCCAACAC TAGCCAGATC GATTTCATCC 6250
TCTCGAGAAG AGAAGATAGG TGTGCGCGCC TAGACTGCAA GGTGATACCT 6300
TCGGATTCGT GTCCAGCGGG ATAAGCGTGC CAAAGTCGCT AGAATGAAGT 6350
GGTGGAAGCT CAAGGGGGAG GTAGCTCAGG CGTTCAAGGA GAGGGTCATT 6400
AGGGAGGGCC CTTGGGAGGA AGGAGGGGAT GCGGACAATG TGTGGATGAA 6450

GATGGCGACT TGCATTCGTA AGGTGGCCTC GGAGGAGTGT GGAGTGTCCA 6500 

5 GGGGATGGAG AAGCGAAGAT AAGGATACCT GGTGGTGGAA TGATGATGTC 7000 

CAGAAGGCAA TTAAAGAGAA GAAAGATTGC TTTAGACGCC T AT ACTTGG A 7050 

TAGGAGTGC A GTC AACATAG AAAAGTAC AA G ATGGCGA AG AAGGCCGCAA 7 1 00 

10 

AGCG AGCTGT C AGTG AAGCA AGGGGTCGGG CATATGAGG A TCTCTACCA A 7 1 50 

CGGTTAGGCA CGAAGGAAGG CGAAAGGGAC ATCTATAAGA TGGCC A AG AT 7200 

CCGAGAGAGA GGAAGACGAG GGATATTGGC CAAGTCAAAT GCATCAAGGA 7250 

1 5 TGGAGCAGAC C AACTCTTGG TGA AGGACGA GGAGATTAAG CATAGATGGC 7300 

GGGAGTACTT CGACAAGCTG TTCAATGGGG AGGATGAGAG TCCTACCATT 7350 

GAACTTGACG ACTCCTTTGA TGAGACCATC ATGCGTTTTA TGCGGCGAAT 7400 

CCAGGAGTCC GAGGTCAAGG AGGCTTTAAA AAGGAGGCAA GGCGATGGGC 7450 

CCTGATTGTA TCCCCATTGA GGTGTGGAAA GGCCTCGGGG ACATAGCGAT 7500 

AGTATGGCTA ACCAAGCTAT TCAACCTCAT TTTTCGGGCA AACAAGATGC 7550

CAGAAGAATG GAGACGAAGT ATATTAGTAC CAATCATCAA ACAGGGGGGA 7600 

TGTTCAGAGT TGTACTAATT ACCATGGAAT TAAGCTGATG AGCCATACAA 7650 

TGAAGCTATG GGAGAGAATC ATTGAGCACC GCTTAAGAAG AATGACAAGC 7700 

GTGACCAAAA ATCAGTTTGG TTTCATGCCT GGGAGGTCGA CCATGGAAAC 7750 

CATTTTCTTG GTACGACAAC TTATGGAGAG ATACAGGGAG CAAAAGAAGG 7800

ACTTGCATAT GGTGTTCATT GACTTGAAGA AGGCCTATAA TAAGATACCG 7850

CGGAATGTCA TGTGGTGGGC CTTGGAGAAA CACAAAGTCC CAGCAAAGTA 7900 

CATTACCCTC ATCAAGGACA TGTACGATAA TGTTGTGACA AGTGTTCGAA 7950 

CAAGTGATGT CGACACTAAT GACTTCCCGA TTAAGATAGG ACTGCATCAG 8000 

GGGTCAGCTT TGAGCCCTTA TCTTTTTGCC TTGGTGATGG ATGAGGTCAC 8050

AAGGGATATA CAAGGAGATA TCGGATGGTG TATGGTGTTT GTGGATGATT 8100

TGGTGCTAGT TGACGATAGT CGGGCGGGGG TAAATAACAA GTTAGAGTTA 8150

TGGAGACAAA CCTTGGAATC GAAAGGGTTT AGGCTTAGTA GAACTAAAAC 8200

CGAGTACATG ATGTGCGGTT TCAGTACTAC TAGGTGTGAG GAGGAGGAGG 8250

TTAGCCTTGA TGGGCAGGTG GTACCCCAGA AGGACACCTT TCGATATTTG 8300

GGGTCAATGC TGCAGGAGGA TGGGGGTATT GATGAAGATG TGAACCATCG 8350 

AATCAAAGCT GGATGGATGA AGTGGCGCCA AGCTTCTGGC ATTCTTTGTG 8400 

ACAAGAGAGT GCCACAAAAG CTAAGGCAAG TTCTACAGGA CGGCGGTTCG 8450 

ACCCGCAATG TTGTATGGCG CTGAGTGTTG GCCGACTAAA AGGCGACATG 8500

TTCAACAGTT AGGTGTGGCG GAGATGCGTA TGTTGAGATG GATGTGTGGC 8550

CACACGAGGA AGRATCGAGT CCGGAATGAT GATATACGAG ATAGAGTTGG 8600

GGTAGCACCA ATTGAAGAGA AGCTTGTCCA ACATCGTCTG AGATGGTTTG 8650 

GGCATATTCA GCGCACGCCT CCGAAAACTC CAGTGCATAA CGGACGGCTA 8700 

AAGCGTGCGG AGAATGTCAA GAGAGGGCGG GGTAGACCGA ATTTGACATG 8750

GGAGGAGTCC GTTAAGAGAG ACCTGAAGGT TTGGAGTATT ACGAAAGAAC 8800 

TAGCTATGGA CARGGGTGCG TGGAAGCTTG TTATCCATGT GCCAGAGCCA 8850 

TGAGTTGATC ACGAGATCTT ATGGGTTTCA CCTCTAGCCT ACCCCAACTT 8900 

GTTTGGGACT AAAGGCTTTG TTGTTGTTGT TGTTGTTGTT GTTGTAGCCA 8950 

ACTAAATCCA GTTGATCAGT GGTTTTTACT CTTATTTTTA CAGGTCATGC 9000

TTGGATCTGG GGATCCAATT TTTGAAGGCT GGATGAGATC TACCGAGTCG 9050

AGTTACAAGG ATAAATTCCG TGGATGGGTT GGATTTAGTG TTCCAGTTTC 9100

CCACAGAATA ACTGCAGGGT ATGCCGAGAA CTTCTTAACA AGACCTTCGT 9150

TATCAGCTTG GATATATTAT AATGTTCAAA ACATTTATGT CTCTCTTTTT 9200 

GTGCAGTTGC GATATATTGT TAATGCCATC CACGTTTGAA CCTTGTGGTC 9250

TTAATCAGCT ATATGCTATG CAATATGGTA CAGTTCCTGT AGTTCATGGA 9300 

ACTGGGGGCC TCCGAGTAAG ACAACTGCCT TGAAAATTAT CGTTATCTTG 9350

GCTCCAACGC AAATGTTCTA ATTGGCTCGT GTATTCAACA GGACACAGTC 9400

GAGACCTTCA ACCCTTTTGG TGCAAAAGGA GAGGAGGGTA CAGGGTACGC 9450

ACTGCTCAAT TTTAGCTAAC TTTCAGTTTA TCTTTTTGCA ATGTCTTGGG 9500

GGTTCATTGC GCCATAAATC AACTTGTGAT AATTAACTGT TACTGTTCTG 9550

TACTTGCAGG TGGGCGTTCT CACCGCTAAC CGTGGACAAG ATGTTGTGGG 9600

TAAGTTTTTG CTGAGCTCTT GTCCGGTTAT AGGATCGACC TTGGCTGTAG 9650

CATGGTACCT TAGTGCCCCT TGTATATAGA CCTAACCTGA TGGACTCACT 9700

TTGTCTACAC TAATCATAGT AGTCGATTGC CCGGAGGCGT TTTGCTTGGA 9750 

TTCTGCTAAT TTAATTTTCA TGACGATAAC TCATACCATG GTTTGGTTCT 9800 

CCGATGGGGG CCAGAATGGC GTCTAGTGTC TGCGATCTGT GTAACTAGCC 9850 

AATGCCGGGT TGTTCCAAGT GAAAATTTAC CTTTTGACCA TTGTGCAGGC 9900

ATTGCGAACC GCGATGTCGA CATTCAGGGA GCACAAGCCG TCCTGGGAGG 9950

GGCTCATGAA GCGAGGCATG ACGAAAGACC ATACGTGGGA CCATGCCGCC 10000

GAGCAGTACG AGCAGATCTT CGAATGGGCC TTCGTGGACC AACCCTACGT 10050

CATGTAGACG GGGACTGGGG AGGTCGAAGC GCGGGTCTCC TTGAGCTCTG 10100

AAGACATGTT CCTCATCCTT CCGCGGCCCG GAAGGATACC CCTGTACATT 10150

GCGTTGTCCT GCTACAGTAG AGTCGCAATG CGCCTGCTTG CTTGGTCCGC 10200

CGGTTCGAGA GTAGATGACG GCTGTGCTGC TGCGGCGGTG ACAGCTTCGG 10250

GTGGATGACA GTTACAGTTT TGGGGAATAA GGAAGGGATG TGCTGCAGGA 10300

TGGTTAACAG CAAAGCACCA CTCAGATGGC AGCCTCTCTG TCCGTGTTAC 10350

AGCTGAAATC AGAAACCAAC TGGTGACTCT TTAGCCTTAG CGATTGTGAA 10400

GTTTGTTGCA TTCTGTGTAT GTTGTCTTGT CCTTAGCTGA CAAATATTTG 10450

ACCTGTTGGA TAATTCTATC TTTGCTGCTG TTTTTCTTTT GGTCAAAAGA 10500

GGGGTTCCCT CCGATTTCAT TAACGAAACC ACCAAAATAA CAGCACCCAG 10550

TGCAGGTCTC AGGTTCAGAT ATACTTAAGA CTACTAAATC TAACAGCAGC 10600

TAAAAAGCTT AAAGATTCAG GCGACATAAC CGAACAAAAT CCACAACCGA 10650

AGGGACCAAA GCAGGACAAG TAAAAAGGCA GNCGACACAA AGCGCAGGTC 10700

GCTGAAAAGG CAAGCAGACA GAGGTCTGCA TTCTGTCAAC ACCACTTGTG 10750 

AAAAATGAAG AGAAGATCGA GAATTCCCGG GAATCCG 10787 



(2) INFORMATION FOR SEQ ID NO: 14:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 647 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(iii) HYPOTHETICAL: NO

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Triticum tauschii

(F) TISSUE TYPE: Endosperm

(ix) FEATURE:

(A) NAME/KEY: Protein

(B) LOCATION: 1..647

(D) OTHER INFORMATION: /product= "deduced amino acid sequence for SSS I"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:

Met Ala Ala Thr Gly Val Gly Ala Gly Cys Leu Ala Pro Ser Val Arg
1 5 10 15

Leu Arg Ala Asp Pro Ala Thr Ala Ala Arg Ala Ser Ala Cys Val Val
20 25 30

Arg Ala Arg Leu Arg Arg Leu Ala Arg Gly Arg Tyr Val Ala Glu Leu
35 40 45

Ser Arg Glu Gly Pro Ala Ala Arg Pro Ala Gln Gln Gln Gln Leu Ala
50 55 60

Pro Pro Leu Val Pro Gly Phe Leu Ala Pro Pro Pro Pro Ala Pro Ala
65 70 75 80

Gln Ser Pro Ala Pro Thr Gln Pro Pro Leu Pro Asp Ala Gly Val Gly
85 90 95

Glu Leu Ala Pro Asp Leu Leu Leu Glu Gly Ile Ala Glu Asp Ser Ile
100 105 110

Asp Ser Ile Ile Val Ala Ala Ser Glu Gln Asp Ser Glu Ile Met Asp
115 120 125

Ala Asn Glu Gln Pro Gln Ala Lys Val Thr Arg Ser Ile Val Phe Val
130 135 140

Thr Gly Glu Ala Ala Pro Tyr Ala Lys Ser Gly Gly Leu Gly Asp Val
145 150 155 160

Cys Gly Ser Leu Pro Ile Ala Leu Ala Ala Arg Gly His Arg Val Met
165 170 175

Val Val Met Pro Arg Tyr Leu Asn Gly Ser Ser Asp Lys Asn Tyr Ala
180 185 190

Lys Ala Leu Tyr Thr Gly Lys His Ile Lys Ile Pro Cys Phe Gly Gly
195 200 205

Ser His Glu Val Thr Phe Phe His Glu Tyr Arg Asp Asn Val Asp Trp
210 215 220

Val Phe Val Asp His Pro Ser Tyr His Arg Pro Gly Ser Leu Tyr Gly
225 230 235 240

Asp Asn Phe Gly Ala Phe Gly Asp Asn Gln Phe Arg Tyr Thr Leu Leu
245 250 255

Cys Tyr Ala Ala Cys Glu Ala Pro Leu Ile Leu Glu Leu Gly Gly Tyr
260 265 270

Ile Tyr Gly Gln Asn Cys Met Phe Val Val Asn Asp Trp His Ala Ser
275 280 285

Leu Val Pro Val Leu Leu Ala Ala Lys Tyr Arg Pro Tyr Gly Val Tyr
290 295 300

Arg Asp Ser Arg Ser Thr Leu Val Ile His Asn Leu Ala His Gln Gly
305 310 315 320

Leu Glu Pro Ala Ser Thr Tyr Pro Asp Leu Gly Leu Pro Pro Glu Trp
325 330 335

Tyr Gly Ala Leu Glu Trp Val Phe Pro Glu Trp Ala Arg Arg His Ala
340 345 350

Leu Asp Lys Gly Glu Ala Val Asn Phe Leu Lys Gly Ala Val Val Thr
355 360 365

Ala Asp Arg Ile Val Thr Val Ser Gln Gly Tyr Ser Trp Glu Val Thr
370 375 380

Thr Ala Glu Gly Gly Gln Gly Leu Asn Glu Leu Leu Ser Ser Arg Lys
385 390 395 400

Ser Val Leu Asn Gly Ile Val Asn Gly Ile Asp Ile Asn Asp Trp Asn
405 410 415

Pro Thr Thr Asp Lys Cys Leu Pro His His Tyr Ser Val Asp Asp Leu
420 425 430

Ser Gly Lys Ala Lys Cys Lys Ala Glu Leu Gln Lys Glu Leu Gly Leu
435 440 445

Pro Val Arg Glu Asp Val Pro Leu Ile Gly Phe Ile Gly Arg Leu Asp
450 455 460

Tyr Gln Lys Gly Ile Asp Leu Ile Lys Met Ala Ile Pro Glu Leu Met
465 470 475 480

Arg Glu Asp Val Gln Phe Val Met Leu Gly Ser Gly Asp Pro Ile Phe
485 490 495

Glu Gly Trp Met Arg Ser Thr Glu Ser Ser Tyr Lys Asp Lys Phe Arg
500 505 510

Gly Trp Val Gly Phe Ser Val Pro Val Ser His Arg Ile Thr Ala Gly
515 520 525

Cys Asp Ile Leu Leu Met Pro Ser Arg Phe Glu Pro Cys Gly Leu Asn
530 535 540

Gln Leu Tyr Ala Met Gln Tyr Gly Thr Val Pro Val Val His Gly Thr
545 550 555 560

Gly Gly Leu Arg Asp Thr Val Glu Thr Phe Asn Pro Phe Gly Ala Lys
565 570 575

Gly Glu Glu Gly Thr Gly Trp Ala Phe Ser Pro Leu Thr Val Asp Lys
580 585 590

Met Leu Trp Ala Leu Arg Thr Ala Met Ser Thr Phe Arg Glu His Lys
595 600 605

Pro Ser Trp Glu Gly Leu Met Lys Arg Gly Met Thr Lys Asp His Thr
610 615 620

Trp Asp His Ala Ala Glu Gln Tyr Glu Gln Ile Phe Glu Trp Ala Phe
625 630 635 640

Val Asp Gln Pro Tyr Val Met
645



(2) INFORMATION FOR SEQ ID NO: 15:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 5072 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Triticum tauschii

(F) TISSUE TYPE: Endosperm

(ix) FEATURE:

(A) NAME/KEY: promoter

(B) LOCATION: 1..4993

(D) OTHER INFORMATION: /function= "region containing promoter of SSS I"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:








TCTAGATGCA TGCTGGATAG CGGTCGATGT GTGGAGTAAT AGTAGTAGAT GCAGAATCGT 60
TTCGGTCTAC TTGTCGCGGA CGTGATGCCT ATATACATGA TCATACCTAG ATATTCTCAT 120
AACTATGCTC AATTCTATCA ATTGCTCGAC AGTAATTCGT TTACCCACCG TAATACTTAT 180
GATCTTGAGA GAAGTCACTA GTGAAACCTA TGCCCCCCAG GTCTATTTTG CATCATATTA 240
ATCTTCCAAT ACTTAGTTAT TTCCATTGCC GTTTATTTTA CTTTGTATCT TTATTTCTTT 300
TTATTATAAA AAATACCAAA AATATTATCT TATCATATCT ATCAGATCTC ATTCTCGTAA 360
GTGACCGTGA AGGGATTGAC AACCCCTTTA TCGTGTTGGT TGCGAGGTTC TTGTTTGTTT 420
GTGTAGGTGC GTGTGACTCG CACGTCTCCT ACTGGATTGA TACCTTGGGT TTTCAAAAAC 480
TGAGAAAAAT ACTTACGCTA CTTTACTGCA TAACCCTTTC CTCTTTAAAA AAAAAAACCA 540
ACGTAGTATT CAAGAGGTAG CACGCTACCA TCCTCTCCAA CAGGAGCGCG GAGATCTTTG 600
TCCGGCAGGT TGATGCGGGC CGGGGAAGAA CTCCAGCTGC CTTGGCCAGC TTGGTCGTGA 660
GCCGCCCCAG CGGCGTCTTG AACCTGTCCA CGTAGCGCTC CCTGACACGC GGCGTGAACT 720
GAGAAGGCTT GTCGATGAAC TCCAGCTGTT GTGCCAGCCT AGCTTGCGCC TTCTTCTGCT 780
GGGTCATGCC CTTCGAGAAA CCCACCTTGG CCACCCTTGT GCTTGAGCGG CGCGCCACCT 840
CAGCAGGCGG CGGCGTGGGG ATGAAGAGGG TGTCTGCTTC CGGAGCAGGC GGGTCGGCGT 900
TGAACTTGAA AGGCGGTGGC CCCATGATGG ATGGGGGGAG CATGCCAAAG ACTTGGTTGA 960
GGAAAGTGGT GTTGGCGTCC ACCTCCAGTG CCTGCAGTTT GGAAGCCAGA CGATTGGCGT 1020
CGATCTCTGG CTCCGGCTGG AAGGAGGCTC GACGCTCCGG TGTGCCAGAA CGCAAAGGGA 1080
GGAGCGGCAG CTCTGGCTGA GCAGACCCCG CGCCCATGTA CTCTGCATTG GGCCAAGGCT 1140
GCAGGGGCAA GCCACCGGGA TGGGGGCGCG AGGTGGACTG CGCACCGGAG GAAGGCCAAG 1200
CTCAACCTCG GTGAGGTTCG CCCCAGACCA GGGCGGCAGG CTCGGGTCCA CAAAGGGCCA 1260
AACCGCCTCG TCCGCCCCGA AACTGTCCAG GACAGACGGC GGACGACGGA AGGCCGTGTC 1320
GTCGAGCTCG AGCAGCAGAG GGTCCGTGCG GGTGATGTCT TGCCAAATGG ACTCCACCTC 1380
CAGCAGGAAG GGGGACTGGT CCATCGCCCC TGGCCAAGCC ACTGGTACGC CAAAGATGGC 1440
ATCAGCAGCG TTTGCACCAG GGGGAGCAGC CACACCTTGG AGGACAGGGA GGGTGCGGAC 1500
GTCGACGGCA GCAAAACGTG GCTGGAGCAA GTTGCCGTCG CGTGCCGGCC TCGGCGAGCG 1560
CGAGCGGCTG TAGGAGCGCT CGGTGCCCTC AGACTCGGAC AGTGCGCCAG TGGGAGAGCC 1620
ATGGCGACGC CGGCCACCAC TGGACGTGCC ATGGCGCTGG TCCTGACGGC GCCTGGATGG 1680
CCCGTCCTCG CGGGCAGCTC CACCTGAGCG GCACCCGAGG AGCACACCCC GCCAAGCTGG 1740
GCCAGGGCGG CTGCGGCGAC GGCGACGGCC GCGGTCGCGG TCTGCACCAT CATCTTCATC 1800
TTCGTCATCG TGGCGCCTCG GACAAGGATG CTCGCTGTCA CCGACGCGAG GGACGTGAGC 1860
CGGCTCAGCC CGCCCTTCCT CGACGTGGCG AGCCCTGCGG ATATGCTCCT CGAGCGGCCA 1920
TTGGGGGTCG TTGGCGCGCG GCATCTCGGG GTCGCGGTCA GCTATCGGGG TGTAGTCCTT 1980
TGTGGTGTCC AGGTGGATGA GCAGAGAGAA ATCCGGCCCC TCTAGCCCCT CGTCCCGGGG 2040
GCAGCCCTCC GGCAGCGTCT GGCGGCCCCT GGGGTCCAGG GGTCGATCGA TGATGGAGAA 2100
CCCCCTTTTG GTGGGGATGT CGTCCGGACT CCATGCCCAC ACCCAGGCAA AGAGGCAGGC 2160
CGTGTTGGAG AGGGAGGTCG TCTGCCGCTC CAACCAGTCG ACGTGGCATG TCTTCCCGAG 2220
CGCATCCTGC CCCGCCTCCT TGTTCCAGGA CTGCACCGGC ATGTTCTCGA CGGCGATGCG 2280
GCAGTAGTAC CGCCAGACAC GGCGGTGGCC GTGTGCCGAT GGTGACCAGG CCGACAGGGA 2340
GAGCGCGACG CCCCAGCAGG AGACGACCCC AGCGTCGAAA GCGATGTCCC GGTGCCTGAA 2400
L> I (jLjAC_GAGC CCAGAGATGG CCAGGCGCAT TGACGCGGGG AAGGGGAAGG AGTTAGGATG 2460
<jL»v-GA(_GQ.GG CCGGAGTGAA CCGCGGCGTG GTGGCCGACG GGGCTGGAGA GGCAGAGGCG 2520
UAG TCATCCG AGAGAGGTGT ATCAGTGGCT CTGCACAATA CCCAGTGTCG CCACATCATA 2580
lCv_ IGCTGAA TAACCACACA TGTGTACTGT CGTTAAATAA ATCATTGGTC ACGCGAACCC 2640
GGAAAAAGAC GGCGAAAAAT TCACGGACAC ACGACTAGTA GTACCCAATA TACTCGGCAA 2700
AAACAGTGAC ACGTCGTTTT GCGTTGTCGG CCGGTGTTGT CGAGTCATTG TACTATGTTT 2760
TGTCGTTTCT TTCTTTTCTC CAAATCGACA AACCGTTTGT CTTTGGTTAA AAAACAGAAA 2820
CATACAAAAT CAAATGAATG CATTCAAGGG CCGGTAATCC AATTCTGAGC CCAGGCTCAG 2880
CTACACCCGC CCTTACAAAA AAATCAAAAT AAATACTAGA AAAATTCAAA AAATTCCAAT 2940
TTGTTTGTGC GTGGTAGATA ATTTGATGCG TGAGGTACGC TTCAATTTTC AAATTATTTG 3000
GACATCTGAG CAGCTCTCAG CAAAAAAGAC AAATTCGGGG TCTGTAAAAA TGTTTACTGT 3060
TCATGCACTG TTCTGACCCG ATTTGTCTTT TTTGCTGAGA GCTTCTCAGA AGTCCAAATG 3120
AGCTAAAATT TTGAGCGGAG CTTACGTGAT AAAATGTCTA TCATGCAAAA AAGGATTGGA 3180
ATTTTTTGAA TTTTTTTTAT TTTTTGTGAT TTGTTTCCTG GACGGGTGCA GATAAGCCTG 3240
GGCACCGAAA CGCCGCACTC AGGCTCATCC TTTTCTATAA AAGAAAAGAA ATACATACAA 3300
TTTCCCTCTG TTTTTTGAGC AAGGGGCACC ACCCACCAAA GAGTTTTCAA CTCACATGGT 3360
ATTAGAGCAT CTACAGCCGG GCGTCTCAAA CCAGCCTCAT ACGCTTGAGC GGGTCGCCTT 3420
GGTCACGATT TTTTGACCCA GACGGGCCCC TCAAACGGTC CTTAAACGCC CAGGCTGACC 3480
GACAACCCAC ATATCCAGCC CAAATATGGG GTGGATATGG GGGCGCCCGG GCACGCCAGC 3540
CCGCGGACAC CACACATCTT CAGTTTCTAA TTTGAGATAT CCGGATGTGG AATGCGTTTT 3600
TGAGGGGTGA CCGGTCCCTG TCCGTGGATG CGCCCGGACG TTTGAGGGGT TGGATTTGCC 3660
AAGTCTGATT AGAGATGCTC TTAGGTGTTC CACCCCCATC CCTTGATGGC TAGGGCAAAC 3720
TCTCCCCTCC AAACTTTGTC GGCGAGCCTG TGGATTCTTC TCTCCTCTGC CCGCTGCTCC 3780
GGCGGCTGAT GGCGGGGAGG AGAATCCCGG TGTCTTCGCT TGGTTAGTTG TTTAAGTTAC 3840
GTACTTTTTT AGTCCTCGCA GGTGCGGCGT TCGGACGTAT GGTCGTGCTT CTTTTTTGAG 3900
TTTGTCTTCC GGGCTCTGAT CCTCCTCGAG TTCGTCCATC TGGACGTACT CGACGGAGCT 3960
CCGGCATAGA TTCCTATCAT CGTCTTGGTG AGGTGAGGTT ATGGTTTCTT GTCATGTGGG 4020
CAGATTTGGT GCCAGATGCT TCATATCTAT TCAAGGGTTC AGCGGCAACA ACTGCGGCTC 4080
CAGAGCGATG GTCCTTAAGG GCACGTGCAC GAAGACTTCA CGGCTGTTAT CGACAAGGTC 4140
AAGCCGGCTC CGATAGGGGA GCAGCGACAG CGGCGCGTCA ACCGCTCGTT CTGGCGGCAG 4200
TAGTGGTCGT TCGGTGCTCT CGGAACCTCG ATGTAATTTT TATGATTTTA GAGATGCTTT 4260
GTACTTCCGA TCGATGAACT CTGATAATAG ATATCTCTTC TCTCGCAAAA AAAGAGAGTT 4320
TTCAACTGAA AACAAAAGAG TTTCACTAGT TCTTCTTTTA GAAACAGAGT TTCACTAGCA 4380
CTTTTTTTTG CGAGAAGTCG AGTTTCACTA AGTACTAAAC CCACGCAATT ATTCTCAAAA 4440
AAAAAACCCA CGCAACTGTC TGGATCCATC TTCGTTTTTT CCCCGAGAAT CGTCTGGATC 4500
CATTTTCGTG TGCGAGGCAT CCTCTCATTT TGCACGGCCC AGCTCTCTTC TCGCCGGCGT 4560
ACGCTGCTAC ATGTCGGCAC TCCACGCAAA CAAAAAGAAG CCCAACCGAA AACGCACGCG 4620
CCTTTCCAGG CTCACCACGG AAAAAAATAC CACGCGCCGC TCACGAGCAA ACCGTGACAA 4680
CAGCCAGCCA GATATGGCAA CGGAGGCACG GGCCGCACAC AGCCACTGAA AACCGCAGCT 4740
GCTCTTCCGT CCGTCCGTCC CTCCGCCCGT CCGCGCCACT CCACTCGCCT TGCCCCACTC 4800
CCACTCTTCT CTCCCCGCGC ACACCGAGTC GGCACCGGCT CATCACCCAT CACCTCGGCC 4860
TCGGCCACCG GCAAACCCCC CGATCCGCTT TTGCAGGCAG CGCACTAAAA CCCCGGGGAG 4920
CGCGCCCCGC GGCAGCAGCA GCACCGCAGT GGGAGAGAGA GGCTTCGCCC CGGCCCGCAC 4980
CGAGCGGGGC GATCCACCGT CCGTGCGTOC GCACCTCCTC CGCCTCCTCC CCTGTCCCGC 5040
GCGCCCACAC CCATGGCGGC GACGGGCGTC GG 5072

(2) INFORMATION FOR SEQ ID NO: 16:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1706 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Triticum tauschii

(F) TISSUE TYPE: Endosperm

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..1706

(D) OTHER INFORMATION: /product= "partial cDNA for hexaploid wheat DBE"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:

GCT GTG TCG AAG CTT GAC TAT TTG AAG GAG CTT GGA GTT AAT TGT ATT 48
Ala Val Ser Lys Leu Asp Tyr Leu Lys Glu Leu Gly Val Asn Cys Ile
1 5 10 15

GAA TTA ATG CCC TGC CAT GAG TTC AAC GAG CTG GAG TAC TCA ACC TCT 96
Glu Leu Met Pro Cys His Glu Phe Asn Glu Leu Glu Tyr Ser Thr Ser
20 25 30

TCT TCC AAG ATG AAC TTT TGG GGA TAT TCT ACC ATA AAC TTC TTT TCA 144
Ser Ser Lys Met Asn Phe Trp Gly Tyr Ser Thr Ile Asn Phe Phe Ser
35 40 45

CCA ATG ACG AGA TAC ACA TCA GGC GGG ATA AAA AAC TGT GGG CGT GAT 192
Pro Met Thr Arg Tyr Thr Ser Gly Gly Ile Lys Asn Cys Gly Arg Asp
50 55 60

GCC ATA AAT GAG TTC AAA ACT TTT GTA AGA GAG GCT CAC AAA CGG GGA 240
Ala Ile Asn Glu Phe Lys Thr Phe Val Arg Glu Ala His Lys Arg Gly
65 70 75 80

ATT GAG GTG ATC CTG GAT GTT GTC TTC AAC CAT ACA GCT GAG GGT AAT 288
Ile Glu Val Ile Leu Asp Val Val Phe Asn His Thr Ala Glu Gly Asn
85 90 95

GAG AAT GGT CCA ATA TTA TCA TTT AGG GGG GTC GAT AAT ACT ACA TAC 336
Glu Asn Gly Pro Ile Leu Ser Phe Arg Gly Val Asp Asn Thr Thr Tyr
100 105 110

TAT ATG CTT GCA CCC AAG GGA GAG TTT TAT AAC TAT TCT GGC TGT GGG 384
Tyr Met Leu Ala Pro Lys Gly Glu Phe Tyr Asn Tyr Ser Gly Cys Gly
115 120 125

AAT ACC TTC AAC TGT AAT CAT CCT GTG GTT CGT CAA TTC ATT GTA GAT 432
Asn Thr Phe Asn Cys Asn His Pro Val Val Arg Gln Phe Ile Val Asp
130 135 140

TGT TTA AGA TAG TGG GTG ATG GAA ATG CAT GTT GAT GGT TTT CGT TTT 480
Cys Leu Arg Tyr Trp Val Met Glu Met His Val Asp Gly Phe Arg Phe
145 150 155 160

GAT CTT GCA TCC ATA ATG ACC AGA GGT TCC AGT CTG TGG GAT CCA GTT 528
Asp Leu Ala Ser Ile Met Thr Arg Gly Ser Ser Leu Trp Asp Pro Val
165 170 175

AAC GTG TAT GGA GCT CCA ATA GAA GGT GAC ATG ATC ACA ACA GGG ACA 576
Asn Val Tyr Gly Ala Pro Ile Glu Gly Asp Met Ile Thr Thr Gly Thr
180 185 190

CCT CTT GTT ACT CCA CCA CTT ATT GAC ATG ATC AGC AAT GAC CCA ATT 624
Pro Leu Val Thr Pro Pro Leu Ile Asp Met Ile Ser Asn Asp Pro Ile
195 200 205

CTT GGA GGC GTC AAG CTC ATT GCT GAA GCA TGG GAT GCA GGA GGC CTC 672
Leu Gly Gly Val Lys Leu Ile Ala Glu Ala Trp Asp Ala Gly Gly Leu
210 215 220

TAT CAA GTA GGT CAA TTC CCT CAC TGG AAT GTT TGG TCT GAG TGG AAT 720
Tyr Gln Val Gly Gln Phe Pro His Trp Asn Val Trp Ser Glu Trp Asn
225 230 235 240

GGG AAG TAC CGG GAC ATT GTG CGC CAA TTC ATT AAA GGC ACT GAT GGA 768
Gly Lys Tyr Arg Asp Ile Val Arg Gln Phe Ile Lys Gly Thr Asp Gly
245 250 255

TTT GCT GGT GGT TTT GCC GAA TGT CTT TGT GGA AGT CCA CAC CTA TAC 816
Phe Ala Gly Gly Phe Ala Glu Cys Leu Cys Gly Ser Pro His Leu Tyr
260 265 270

CAG GCA GGA GGA AGG AAA CCT TGG CAC AGT ATC AAC TTT GTA TGT GCA 864
Gln Ala Gly Gly Arg Lys Pro Trp His Ser Ile Asn Phe Val Cys Ala
275 280 285

CAT GAT GGA TTT ACA CTG GGT GAT TTG GTA ACA TAT AAT AAC AAG TAC 912
His Asp Gly Phe Thr Leu Gly Asp Leu Val Thr Tyr Asn Asn Lys Tyr
290 295 300

AAT TTA CCA AAT GGG GAG AAC AAT AGA GAT GGA GAA AAT CAC AAT CTT 960
Asn Leu Pro Asn Gly Glu Asn Asn Arg Asp Gly Glu Asn His Asn Leu
305 310 315 320

AGC TGG AAT TGT GGG GAG GAA GGA GAA TTC GCA AGA TTG TCT GTC AAA 1008
Ser Trp Asn Cys Gly Glu Glu Gly Glu Phe Ala Arg Leu Ser Val Lys
325 330 335

AGA TTG AGG AAG AGG CAG ATG CGC AAT TTC TTT GTT TGT CTC ATG GTT 1056
Arg Leu Arg Lys Arg Gln Met Arg Asn Phe Phe Val Cys Leu Met Val
340 345 350

TCT CAA GGA GTT CCA ATG TTT TAC ATG GGC GAT GAA TAT GGC CAC ACA 1104
Ser Gln Gly Val Pro Met Phe Tyr Met Gly Asp Glu Tyr Gly His Thr
355 360 365

AAA GGG GGC AAC AAC AAT ACA TAC TGC CAT GAT TCT TAT GTC AAT TAT 1152
Lys Gly Gly Asn Asn Asn Thr Tyr Cys His Asp Ser Tyr Val Asn Tyr
370 375 380




TTT CGC TGG GAT AAA AAA GAA CAA TAC TCT GAC TTG CAC AGA TTC TGC 1200
Phe Arg Trp Asp Lys Lys Glu Gln Tyr Ser Asp Leu His Arg Phe Cys
385 390 395 400

TGC CTC ATG ACC AAA TTC CGC AAG GAG TGC GAG GGT CTT GGC CTT GAG 1248
Cys Leu Met Thr Lys Phe Arg Lys Glu Cys Glu Gly Leu Gly Leu Glu
405 410 415

GAC TTT CCA ACG GCC GAA CGG CTG CAG TGG CAT GGT CAT CAG CCT GGG 1296
Asp Phe Pro Thr Ala Glu Arg Leu Gln Trp His Gly His Gln Pro Gly
420 425 430

AAG CCT GAT TGG TCT GAG AAT AGC CGA TTC GTT GCC TTT TCC ATG AAA 1344
Lys Pro Asp Trp Ser Glu Asn Ser Arg Phe Val Ala Phe Ser Met Lys
435 440 445

GAT GAA AGA CAG GGC GAG ATC TAT GTG GCC TTC AAC ACC AGC CAC TTA 1392
Asp Glu Arg Gln Gly Glu Ile Tyr Val Ala Phe Asn Thr Ser His Leu
450 455 460

CCG GCC GTT GTT GAG CTC CCA GAG CGC GCA GGG CGC CGG TGG GAA CCG 1440
Pro Ala Val Val Glu Leu Pro Glu Arg Ala Gly Arg Arg Trp Glu Pro
465 470 475 480

GTG GTG GAC ACA GGC AAG CCA GCA CCA TAT GAC TTC CTC ACC GAC GAC 1488
Val Val Asp Thr Gly Lys Pro Ala Pro Tyr Asp Phe Leu Thr Asp Asp
485 490 495

TTA CCT GAT CGC GCT CTC ACC ATA CAC CAG TTC TCT CAT TTC CTC AAC 1536
Leu Pro Asp Arg Ala Leu Thr Ile His Gln Phe Ser His Phe Leu Asn
500 505 510

TCC AAC CTC TAC CCC ATG CTC AGC TAC TCA TCG GTC ATC CTA GTA TTG 1584
Ser Asn Leu Tyr Pro Met Leu Ser Tyr Ser Ser Val Ile Leu Val Leu
515 520 525

CGC CCT GAT GTT TGA GAG ACA AAT ATA TAC AGT AAA TAA TAT GTC TAT 1632
Arg Pro Asp Val * Glu Thr Asn Ile Tyr Ser Lys * Tyr Val Tyr
530 535 540

ATG TAG TCC TTT GGC GTA TTA TCA GTG TGC ACA ATT GCT CTA TTG CCA 1680
Met * Ser Phe Gly Val Leu Ser Val Cys Thr Ile Ala Leu Leu Pro
545 550 555 560

GTG ATC TAT TCG ATA GCG GCC GCG AA 1706
Val Ile Tyr Ser Ile Ala Ala Ala
565



(2) INFORMATION FOR SEQ ID NO: 17:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 9289 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Triticum tauschii

(F) TISSUE TYPE: Endosperm

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..9289

(D) OTHER INFORMATION: /product= "genomic sequence of DBE"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:

CGG GAC CGT CCC TTG GCA ACT TGG GTT ACG TTG GGA CCT GAC GCT TCG 48
Arg Asp Arg Pro Leu Ala Thr Trp Val Thr Leu Gly Pro Asp Ala Ser
570 575 580

CTT ATC CGG TGT GCC CTG AGA CGA GAT ATG TGC AGC TCC TAT CGG ATT 96
Leu Ile Arg Cys Ala Leu Arg Arg Asp Met Cys Ser Ser Tyr Arg Ile
585 590 595 600

TGT CGG CAC ATT CGG CGG CTT TGC TGG TCT TGT TTT ACC ATT GTC GAA 144
Cys Arg His Ile Arg Arg Leu Cys Trp Ser Cys Phe Thr Ile Val Glu
605 610 615

ATG TCT TAT AAA CCG GGA TTC CGA GAC TGA TCG GGT CTT CCC GGG AGA 192
Met Ser Tyr Lys Pro Gly Phe Arg Asp * Ser Gly Leu Pro Gly Arg
620 625 630

AGG TTT ATC CTT CGT TGA CCG TGA GAG CTT ATA ATG GGC TAA GTT GGG 240
Arg Phe Ile Leu Arg * Pro * Glu Leu Ile Met Gly * Val Gly
635 640 645

ACA CCC CTG CAG GGT ATT ATC TTT CGA AAG CCG TGC CCG CGG TTA TGA 288
Thr Pro Leu Gln Gly Ile Ile Phe Arg Lys Pro Cys Pro Arg Leu *
650 655 660

GGC AGA TGG GAA TTT GTT AAT GTC CGA TTG TAG AGA ACC TGT CAC TTG 336
Gly Arg Trp Glu Phe Val Asn Val Arg Leu * Arg Thr Cys His Leu
665 670 675 680

ACT TAA TTT AAA ATT CAT CAA CCG TGT GTG TAG CCG TGA TGG TCT CTT 384
Thr * Phe Lys Ile His Gln Pro Cys Val * Pro * Trp Ser Leu
685 690 695

TTC GGC GGA GTC CGG GAA GTG AAC ACG GTT TGA GTT ATG CAT GAA CGT 432
Phe Gly Gly Val Arg Glu Val Asn Thr Val * Val Met His Glu Arg
700 705 710

AAG TAG TTT CAG GAT CAC TCC TTG ATC ACT TCT AGC TCC GCG ACC GTT 480
Lys * Phe Gln Asp His Ser Leu Ile Thr Ser Ser Ser Ala Thr Val
715 720 725

GCG TTG TTT CTC TTC TCG CTC TCA TTT GCG TAT GTT AGC CAC CAT ATA 528
Ala Leu Phe Leu Phe Ser Leu Ser Phe Ala Tyr Val Ser His His Ile
730 735 740

TGC TTA GTG TCT GCT GCA GCT CCA CCT CAT TAC CCC TTC CTT TCC TAT 576
Cys Leu Val Ser Ala Ala Ala Pro Pro His Tyr Pro Phe Leu Ser Tyr
745 750 755 760

AAG CTT AAA TAG TCT TGA TCT CGC GGG TGT GAG ATT GCT GAG TCC TCG 624
Lys Leu Lys * Ser * Ser Arg Gly Cys Glu Ile Ala Glu Ser Ser
765 770 775

TGA CTT ACA GAT TCT ACC AAA ACA GTT GCA GGT GTC GAC GAT GCC AGT 672
* Leu Thr Asp Ser Thr Lys Thr Val Ala Gly Val Asp Asp Ala Ser
780 785 790

GCA GGT GAC GCA ACC GAG CTC AAG TGG GAG TTC GAC GAG GAA CGT GGT 720
Ala Gly Asp Ala Thr Glu Leu Lys Trp Glu Phe Asp Glu Glu Arg Gly
795 800 805

CGT TAC TAT GTT TCT TTT CCT GAT GAT CAG TAG TGG AGC CCA GTT GGG 768
Arg Tyr Tyr Val Ser Phe Pro Asp Asp Gln * Trp Ser Pro Val Gly
810 815 820



ACG ATC GGG GAT CTA GCA TTT GGG GTT ATC TTA ATT TCT TTT AGA TTT 816
Thr Ile Gly Asp Leu Ala Phe Gly Val Ile Leu Ile Ser Phe Arg Phe
825 830 835 840

GAC CGT AAT CGG TCT ATG TGT GGA TTT TGG ATG ATG TAT GAA TTA TTT 864
Asp Arg Asn Arg Ser Met Cys Gly Phe Trp Met Met Tyr Glu Leu Phe
845 850 855

ATG TAT TGT GTG AAG TGG CGA TTG TAA GCC AAC TCT CGT TAT CCC ATT 912
Met Tyr Cys Val Lys Trp Arg Leu * Ala Asn Ser Arg Tyr Pro Ile
860 865 870

CTT GTT CAT TAC ATG GGA TTG TGT GAA GAT GAC CCT TCT TGC GAC AAA 960
Leu Val His Tyr Met Gly Leu Cys Glu Asp Asp Pro Ser Cys Asp Lys
875 880 885

ACC ACA ATG CGG TTA TGC CTC TAA GTC GTG CCT CGA CAC GTG GGA GAT 1008
Thr Thr Met Arg Leu Cys Leu * Val Val Pro Arg His Val Gly Asp
890 895 900

ATA GCC GCA TCG TGG GCG TTA CAC GCA AGT CTT CAT AGC AAC CAA AAC 1056
Ile Ala Ala Ser Trp Ala Leu His Ala Ser Leu His Ser Asn Gln Asn
905 910 915 920

TCC TCT CCG CAT TAC AAG CCA CCA ATC GCA GCC ACC ATG ACT TTC TTC 1104
Ser Ser Pro His Tyr Lys Pro Pro Ile Ala Ala Thr Met Thr Phe Phe
925 930 935

ACC ACT GTC AAT GCC ATG AAA ATC TAT ATG TAG ACA TGT CCC ATT GCA 1152
Thr Thr Val Asn Ala Met Lys Ile Tyr Met * Thr Cys Pro Ile Ala
940 945 950

TCG GCA AGA AAG CGA AGC TTC ACG GCA CAC CTT CAT GAA GCC TCT CTG 1200
Ser Ala Arg Lys Arg Ser Phe Thr Ala His Leu His Glu Ala Ser Leu
955 960 965

GCC GAA GAC AAG GAT GCG CCC GAC CGG ATC AAT TCC TAT CTA GAT ACC 1248
Ala Glu Asp Lys Asp Ala Pro Asp Arg Ile Asn Ser Tyr Leu Asp Thr
970 975 980

TAG TGG AGC CAT GCG CCA ATA GCG GAG ATC TCC GAG AGG AAG ACC GGA 1296
* Trp Ser His Ala Pro Ile Ala Glu Ile Ser Glu Arg Lys Thr Gly
985 990 995 1000

ACT CGT CGG ACG TCG GCG TCC AAA TCG AGG AGG CCG GCA TGA AGC ACA 1344
Thr Arg Arg Thr Ser Ala Ser Lys Ser Arg Arg Pro Ala * Ser Thr
1005 1010 1015

TCG AGG ATG GTG ATC CCC ATA CGG GTA GAT CGG GTC GGC CGC CAT CTC 1392
Ser Arg Met Val Ile Pro Ile Arg Val Asp Arg Val Gly Arg His Leu
1020 1025 1030

ACA CCG AGA TTA GGA TGC TTA AAA CGG TTT TTT TGG CAC TAG CAT TAT 1440
Thr Pro Arg Leu Gly Cys Leu Lys Arg Phe Phe Trp His * His Tyr
1035 1040 1045

TTT GCA TCA TCC GTT GGA GAG AAC ATG AGA GAG CCC CAT TTC TTC CAC 1488
Phe Ala Ser Ser Val Gly Glu Asn Met Arg Glu Pro His Phe Phe His
1050 1055 1060

GGT TCT ACC TAT GGG ATC TTG TTC TGC TTG CAA CCG GGC CTC ACG GAA 1536
Gly Ser Thr Tyr Gly Ile Leu Phe Cys Leu Gln Pro Gly Leu Thr Glu
1065 1070 1075 1080

AAC CCG CGC CAG CGG ACC CAC CCC ATG CTA GCA GGG CAC GGC ACC CGC 1584
Asn Pro Arg Gln Arg Thr His Pro Met Leu Ala Gly His Gly Thr Arg
1085 1090 1095

AGC GGC CGG TCC AAA TGG ACG GTG AGA ACC GCA ACG CGA CAC GCC CGG 1632
Ser Gly Arg Ser Lys Trp Thr Val Arg Thr Ala Thr Arg His Ala Arg
1100 1105 1110

CAC TGT CAG CAA AGC GAG AGC GCG CGC ACG GCA CAC GCA CGC TCG GAC 1680
His Cys Gln Gln Ser Glu Ser Ala Arg Thr Ala His Ala Arg Ser Asp
1115 1120 1125

GAA CGG ACG GTG CGA TCG ATC CCT CCC CCC TCG CTC AAC CAC AGT AGT 1728
Glu Arg Thr Val Arg Ser Ile Pro Pro Pro Ser Leu Asn His Ser Ser
1130 1135 1140

ACC CTG CCA CAC TAT CAC GCA CGC ACT CGA GTC ACA CCT CCC ACG AAG 1776
Thr Leu Pro His Tyr His Ala Arg Thr Arg Val Thr Pro Pro Thr Lys
1145 1150 1155 1160

AAC CAA CAG GAG GCG CGG ATC CCA CCG ATA AAT AAC CCC GCC TCG CCG 1824
Asn Gln Gln Glu Ala Arg Ile Pro Pro Ile Asn Asn Pro Ala Ser Pro
1165 1170 1175

CTC CTC CCC AAA ATC AAT CAC CGA TCG CTC GGG GTT CCC GGC ATG ACG 1872
Leu Leu Pro Lys Ile Asn His Arg Ser Leu Gly Val Pro Gly Met Thr
1180 1185 1190

ATG ATG GCC ATG GCC AAG GCG CCC TGC CTC TGC GCG CGC CCG TCC CTC 1920
Met Met Ala Met Ala Lys Ala Pro Cys Leu Cys Ala Arg Pro Ser Leu
1195 1200 1205

GCC GCG CGC GCG AGG CGG CCG GGG CCG GGG CCG GCG CCG CGC CTG CGA 1968
Ala Ala Arg Ala Arg Arg Pro Gly Pro Gly Pro Ala Pro Arg Leu Arg
1210 1215 1220

CGG TGG CGA CCC AAT GCG ACG GCG GGG AAG GGG GTC GGC GAG GTG TGC 2016
Arg Trp Arg Pro Asn Ala Thr Ala Gly Lys Gly Val Gly Glu Val Cys
1225 1230 1235 1240

GCC GCG GTT GTC GAG GCG GCG ACG AAG GCC GAG GAT GAG GAC GAC GAC 2064
Ala Ala Val Val Glu Ala Ala Thr Lys Ala Glu Asp Glu Asp Asp Asp
1245 1250 1255

GAG GAG GAG GCG GTG GCG GAG GAC AGG TAC GCG CTC GGC GGC GCG TGC 2112
Glu Glu Glu Ala Val Ala Glu Asp Arg Tyr Ala Leu Gly Gly Ala Cys
1260 1265 1270

AGG GTG CTC GCC GGA ATG CCC GCG CCG CTG GGC GCC ACC GCG CTC GCC 2160
Arg Val Leu Ala Gly Met Pro Ala Pro Leu Gly Ala Thr Ala Leu Ala
1275 1280 1285

GGC GGG GTC AAT TTC GCC GTC TAC TCC GGT GGA GCC ACC GCC GCG GCG 2208
Gly Gly Val Asn Phe Ala Val Tyr Ser Gly Gly Ala Thr Ala Ala Ala
1290 1295 1300

CTC TGC CTC TTC ACG CCA GAA GAT CTC AAG GCG GTG GGG TTG CCT CCC 2256
Leu Cys Leu Phe Thr Pro Glu Asp Leu Lys Ala Val Gly Leu Pro Pro
1305 1310 1315 1320

GAG TAG AGT TCA TCA GCT TTG CGT GCG CCG CGC GCC CCC TTT TCT GGC 2304
Glu * Ser Ser Ser Ala Leu Arg Ala Pro Arg Ala Pro Phe Ser Gly
1325 1330 1335

CTG CGA TTT AAG TTT TGT ACT GGG GGA AAT GCT GCA GGA TAG GGT GAC 2352
Leu Arg Phe Lys Phe Cys Thr Gly Gly Asn Ala Ala Gly * Gly Asp
1340 1345 1350



20 



GGA GGA GGT TTC CCT TGA CCC CCT GAT GAA TCG GAC TGG GAA CGT GTG 2400
Gly Gly Gly Phe Pro * Pro Pro Asp Glu Ser Asp Trp Glu Arg Val
1355 1360 1365

GCA TGT CTT CAT TGA AGG CGA GCT GCA CGA CAT GCT TTA CGG GTA CAG 2448
Ala Cys Leu His * Arg Arg Ala Ala Arg His Ala Leu Arg Val Gln
1370 1375 1380

GTT CGA CGG CAC CTT TGC TCC TCA CTG CGG GCA CTA CCT TGA TAT TTC 2496
Val Arg Arg His Leu Cys Ser Ser Leu Arg Ala Leu Pro * Tyr Phe
1385 1390 1395 1400

CAA TGT CGT GGT GGA TCC TTA TGC TAA GGT GAT CAT ACT TTA GCT TTA 2544
Gln Cys Arg Gly Gly Ser Leu Cys * Gly Asp His Thr Leu Ala Leu
1405 1410 1415

CCT GCA TCT TGG TAT TTA CAG TAG AAA TTG TTA CGT GGA CCC TTA TTT 2592
Pro Ala Ser Trp Tyr Leu Gln * Lys Leu Leu Arg Gly Pro Leu Phe
1420 1425 1430



GTT GCC TTT TGT GTT GCT CTA GGC AGT GAT AAG CCG AGG GGA GTA TGG 2640
Val Ala Phe Cys Val Ala Leu Gly Ser Asp Lys Pro Arg Gly Val Trp
1435 1440 1445

CGT TCC GGC GCG TGG TAA CAA TTG CTG GCC TCA GAT GGC TGG CAT GAT 2688
Arg Ser Gly Ala Trp * Gln Leu Leu Ala Ser Asp Gly Trp His Asp
1450 1455 1460

CCC TCT TCC ATA TAG CAC GGT ATG CCT GAT TGC TGA AAA TAT TGG CTG 2736
Pro Ser Ser Ile * His Gly Met Pro Asp Cys * Lys Tyr Trp Leu
1465 1470 1475 1480



CAT TTG TTT CTC TCT TTT TCT CAT ATT TTT CTC CTG TCT TTC ACT TGT 2784
His Leu Phe Leu Ser Phe Ser His Ile Phe Leu Leu Ser Phe Thr Cys
1485 1490 1495

ACT ACA TTG CCT CAG ACA GTC ATG ATC AAA GAG AGC AGT GTC ATT AGA 2832
Thr Thr Leu Pro Gln Thr Val Met Ile Lys Glu Ser Ser Val Ile Arg
1500 1505 1510

CAT TTG TAG TTG TCT GCT GAC TTT GAC CAA AAC TTG TAA TTT ACT GTT 2880
His Leu * Leu Ser Ala Asp Phe Asp Gln Asn Leu * Phe Thr Val
1515 1520 1525

GTT AAA GGT CCT TGA ATC ATA TTT TTT TAT AAT ATT ATG TTT GCA AGT 2928
Val Lys Gly Pro * Ile Ile Phe Phe Tyr Asn Ile Met Phe Ala Ser
1530 1535 1540

GGA AGT AAA GTG AAA TTG CAT CTA GTA TTT GTT GTT GCT GTC TTA GTC 2976
Gly Ser Lys Val Lys Leu His Leu Val Phe Val Val Ala Val Leu Val
1545 1550 1555 1560

GTT TAA TTG GAC ATG CAG TAA AAA GGT TTG CAT CTG CAG TTT GAT TGG 3024
Val * Leu Asp Met Gln * Lys Gly Leu His Leu Gln Phe Asp Trp
1565 1570 1575

GAA GGC GAC CTA CCT CTA AGA TAT CCT CAA AAG GAC CTG GTA ATA TAT 3072
Glu Gly Asp Leu Pro Leu Arg Tyr Pro Gln Lys Asp Leu Val Ile Tyr
1580 1585 1590

GAG ATG CAC TTG CGT GGA TTC ACG AAG CAT GAT TCA AGC AAT GTA GAA 3120
Glu Met His Leu Arg Gly Phe Thr Lys His Asp Ser Ser Asn Val Glu
1595 1600 1605

CAT CCG GGT ACT TTC ATT GGA GCT GTG TCG AAG CTT GAC TAT TTG AAG 3168
His Pro Gly Thr Phe Ile Gly Ala Val Ser Lys Leu Asp Tyr Leu Lys
1610 1615 1620



GTA CAG CTG TAC TTG CTG ACT ACA TAG GAT AAT TTT TAA AGA AAG CTA 3216
Val Gln Leu Tyr Leu Leu Thr Thr * Asp Asn Phe * Arg Lys Leu
1625 1630 1635 1640

CAT ATT AGC CAG AAT TTG GGT TAT TAC AAA AAC TAC TGC ATA CTA TAG 3264
His Ile Ser Gln Asn Leu Gly Tyr Tyr Lys Asn Tyr Cys Ile Leu *
1645 1650 1655

CAG TTA CAT GCT CAT TAT CGA GGA GAT GCT CAC ACG CAT CTT ATT TGG 3312
Gln Leu His Ala His Tyr Arg Gly Asp Ala His Thr His Leu Ile Trp
1660 1665 1670

ATT TAA TAC CCA ATT CTG TTT TGA TAT TGG ACT GTT CCC TCT ACA GGA 3360
Ile * Tyr Pro Ile Leu Phe * Tyr Trp Thr Val Pro Ser Thr Gly
1675 1680 1685

GCT TGG AGT TAA TTG TAT TGA ATT AAT GCC CTG CCA TGA GTT CAA CGA 3408
Ala Trp Ser * Leu Tyr * Ile Asn Ala Leu Pro * Val Gln Arg
1690 1695 1700

GCT GGA GTA CTC AAC CTC TTC TTC CAA GTA AGG ACA TGA ATT TAG TAT 3456
Ala Gly Val Leu Asn Leu Phe Phe Gln Val Arg Thr * Ile * Tyr
1705 1710 1715 1720

TAG CCT GCC AGC ACT GTT TGA GTG AGA GTT CAT ACA CAT TTT GTG CCT 3504
* Pro Ala Ser Thr Val * Val Arg Val His Thr His Phe Val Pro
1725 1730 1735

GCA TAA CTG ATA TTT GTT CAA ACT ATT TTT TTT AGC AGT CAC TCA ACA 3552
Ala * Leu Ile Phe Val Gln Thr Ile Phe Phe Ser Ser His Ser Thr
1740 1745 1750

GTT TTA CAT ATA TAT ATA ATA TAG ACT ATT CGT CAC CCT GGG TGA GGA 3600
Val Leu His Ile Tyr Ile Ile * Thr Ile Arg His Pro Gly * Gly
1755 1760 1765

ATA GTT ATT CTT CAC CCA CCT CTA TTT TAA CAT CTA TGC ACC GTA ATT 3648
Ile Val Ile Leu His Pro Pro Leu Phe * His Leu Cys Thr Val Ile
1770 1775 1780

TTA CGT TTC GTA AAT TTG TCT TAT TTT AGA GAT AAA AAG AGA ACG TAA 3696
Leu Arg Phe Val Asn Leu Ser Tyr Phe Arg Asp Lys Lys Arg Thr *
1785 1790 1795 1800

GAA AAC CTA TAA TCG TCG TAA AAA AAA ATA TGT TAC GTA AAA TTA CAA 3744
Glu Asn Leu * Ser Ser * Lys Lys Ile Cys Tyr Val Lys Leu Gln
1805 1810 1815

ATG TAA AAA CAT AGT GTA AAA TGT ACA TAA AAT ACA TTT TTT GAC CTA 3792
Met * Lys His Ser Val Lys Cys Thr * Asn Thr Phe Phe Asp Leu
1820 1825 1830

TAT TTT TTT TGT TAA TGC CAA ATT TTA TAC AGT AAA TCA ATA TGA ATG 3840
Tyr Phe Phe Cys * Cys Gln Ile Leu Tyr Ser Lys Ser Ile * Met
1835 1840 1845

TAA CTA TTT GTA TTT CAA ATG TAA TTT ATT TAT GAA ATG GTC GTA AGA 3888
* Leu Phe Val Phe Gln Met * Phe Ile Tyr Glu Met Val Val Arg
1850 1855 1860



TTA CCT CGG GTG AAG AAT AAC TTA TTC TGC ACC CTG GGT GAT GAA TAG 3936
Leu Pro Arg Val Lys Asn Asn Leu Phe Cys Thr Leu Gly Asp Glu *
1865 1870 1875 1880

TAA CAC TAT ATA TAT ATA TAT ATA TAT ATA TAT ATA TAT ATA CCG GCT 3984
* His Tyr Ile Tyr Ile Tyr Ile Tyr Ile Tyr Ile Tyr Ile Pro Ala
1885 1890 1895

GCT GCT AAT GAT GTT AAT ATT TCG CAA GTA CCT AAG CTG GAT TTT TCT 4032
Ala Ala Asn Asp Val Asn Ile Ser Gln Val Pro Lys Leu Asp Phe Ser
1900 1905 1910

CCA TGA GAC ATC AAT CCA TAA TTG AAA TTG GTC ACG ACA GTT GAA TAG 4080
Pro * Asp Ile Asn Pro * Leu Lys Leu Val Thr Thr Val Glu *
1915 1920 1925

TTG ATA GCT GAA AAT GAA ATC CAG CAT GCT ACT GTC TTG CCA TCT CCA 4128
Leu Ile Ala Glu Asn Glu Ile Gln His Ala Thr Val Leu Pro Ser Pro
1930 1935 1940

GAC TTG CTA ACA TGA ATT TTG TCT GCC TAC CTG TCA TTT GTA CCA ACG 4176
Asp Leu Leu Thr * Ile Leu Ser Ala Tyr Leu Ser Phe Val Pro Thr
1945 1950 1955 1960

TTC CCA ATT GCC CTC TCA TTA TTC GTG TGT ACC ATG CAT ATG TGT TTT 4224
Phe Pro Ile Ala Leu Ser Leu Phe Val Cys Thr Met His Met Cys Phe
1965 1970 1975

AAC ATG ATT ATT GTT GGC TAT ATT TCT CTT TGG AAA CAT GAC TAA TTT 4272
Asn Met Ile Ile Val Gly Tyr Ile Ser Leu Trp Lys His Asp * Phe
1980 1985 1990

ATC ACC CGT TTT GTA TAA ACT GCT TGT TTT CAT ATC AGG ATG AAC TTT 4320
Ile Thr Arg Phe Val * Thr Ala Cys Phe His Ile Arg Met Asn Phe
1995 2000 2005

TGG GGA TAT TCT ACC ATA AAC TTC TTT TCA CCA ATG ACG AGA TAC ACA 4368
Trp Gly Tyr Ser Thr Ile Asn Phe Phe Ser Pro Met Thr Arg Tyr Thr
2010 2015 2020



TCA GGC GGG ATA AAA AAC TGT GGG CGT GAT GCC ATA AAT GAG TTC AAA 4416
Ser Gly Gly Ile Lys Asn Cys Gly Arg Asp Ala Ile Asn Glu Phe Lys
2025 2030 2035 2040

ACT TTT GTA AGA GAG GCT CAC AAA CGG GGA ATT GAG GTA AGC AAG TCG 4464
Thr Phe Val Arg Glu Ala His Lys Arg Gly Ile Glu Val Ser Lys Ser
2045 2050 2055

TAC GAG TTA GTT GCT CCT TTT GAA CTT ATC AAT TTG ATG CGA AGA CAT 4512
Tyr Glu Leu Val Ala Pro Phe Glu Leu Ile Asn Leu Met Arg Arg His
2060 2065 2070

GTT ACT GCT AGG TGA TCC TGG ATG TTG TCT TCA ACC ATA CAG CTG AGG 4560
Val Thr Ala Arg * Ser Trp Met Leu Ser Ser Thr Ile Gln Leu Arg
2075 2080 2085

GTA ATG AGA ATG GTC CAA TAT TAT CAT TTA GGG GGG TCG ATA ATA CTA 4608
Val Met Arg Met Val Gln Tyr Tyr His Leu Gly Gly Ser Ile Ile Leu
2090 2095 2100

CAT ACT ATA TGC TTG CAC CCA AGG TGA CAG ATC TTT CTT GCT GCG TAA 4656
His Thr Ile Cys Leu His Pro Arg * Gln Ile Phe Leu Ala Ala *
2105 2110 2115 2120

TTG TTC TTT CAT AGA TGT ATA GAG CAT AGA TGT GTT ATG TAG TAG TTC 4704
Leu Phe Phe His Arg Cys Ile Glu His Arg Cys Val Met * * Phe
2125 2130 2135

TTT TTC AAG GGG ATT ATG TTC ATG CAG GGA GAG TTT TAT AAC TAT TCT 4752
Phe Phe Lys Gly Ile Met Phe Met Gln Gly Glu Phe Tyr Asn Tyr Ser
2140 2145 2150

GGC TGT GGG AAT ACC TTC AAC TGT AAT CAT CCT GTG GTT CGT CAA TTC 4800
Gly Cys Gly Asn Thr Phe Asn Cys Asn His Pro Val Val Arg Gln Phe
2155 2160 2165

ATT GTA GAT TGT TTA AGG TAC AGA TAT ACA TTT TAC TTC TAG AAC TAC 4848
Ile Val Asp Cys Leu Arg Tyr Arg Tyr Thr Phe Tyr Phe * Asn Tyr
2170 2175 2180

TTT TTC ATT TCT TTT GCT GCT TGT CAT TTT GAT ATG ATT AAT TTG CAA 4896
Phe Phe Ile Ser Phe Ala Ala Cys His Phe Asp Met Ile Asn Leu Gln
2185 2190 2195 2200

GCT TGT GGG GGT AAA TCT TTT GGT CAG CAT ATT GTA TCT TTA AAT GTC 4944
Ala Cys Gly Gly Lys Ser Phe Gly Gln His Ile Val Ser Leu Asn Val
2205 2210 2215

ACA AAT ACT AAT GTC CTG GTG CTT ATT GAT TTG GCA TCT TCA AAT TCT 4992
Thr Asn Thr Asn Val Leu Val Leu Ile Asp Leu Ala Ser Ser Asn Ser
2220 2225 2230

TCT CCA ATG AAA AGG GAA AAA TCT ACT GTA TGT CTC GTC AAC TAA TTT 5040
Ser Pro Met Lys Arg Glu Lys Ser Thr Val Cys Leu Val Asn * Phe
2235 2240 2245

ACT TTT GTT TTG CAG ATA CTG GGT GAT GGA AAT GCA TGT TGA TGG TTT 5088
Thr Phe Val Leu Gln Ile Leu Gly Asp Gly Asn Ala Cys * Trp Phe
2250 2255 2260

TCG TTT TGA TCT TGC ATC CAT AAT GAC CAG AGG TTC CAG GTA ATT TGT 5136
Ser Phe * Ser Cys Ile His Asn Asp Gln Arg Phe Gln Val Ile Cys
2265 2270 2275 2280

ATT TAT TGT TTG TTT GCG TGT TGC CTT TTC AGA AGA TTC TTA AAA GAA 5184
Ile Tyr Cys Leu Phe Ala Cys Cys Leu Phe Arg Arg Phe Leu Lys Glu
2285 2290 2295



TGT TTC TTT TAC AAG TCT GTG GGA TCC AGT TAA CGT GTA TGG AGC TCC 5232
Cys Phe Phe Tyr Lys Ser Val Gly Ser Ser * Arg Val Trp Ser Ser
2300 2305 2310

AAT AGA AGG TGA CAT GAT CAC AAC AGG GAC ACC TCT TGT TAC TCC ACC 5280
Asn Arg Arg * His Asp His Asn Arg Asp Thr Ser Cys Tyr Ser Thr
2315 2320 2325

ACT TAT TGA CAT GAT CAG CAA TGA CCC AAT TCT TGG AGG CGT CAA GGT 5328
Thr Tyr * His Asp Gln Gln * Pro Asn Ser Trp Arg Arg Gln Gly
2330 2335 2340

ACT TGT TTC ATC CAA CAC CTG TTG TCT GTG TGC ATT CAA TTG TTT TAA 5376
Thr Cys Phe Ile Gln His Leu Leu Ser Val Cys Ile Gln Leu Phe *
2345 2350 2355 2360

TAT GGT AAT GAT CAA TTT CCC AAT GTT GAT AAG GAA AAA AAA TGC AAG 5424
Tyr Gly Asn Asp Gln Phe Pro Asn Val Asp Lys Glu Lys Lys Cys Lys
2365 2370 2375



TAG CTC TCT TTA TCT GCT TCT TGT GAG TTA TGC TAA ACA TGT AGA TAC 5472
* Leu Ser Leu Ser Ala Ser Cys Glu Leu Cys * Thr Cys Arg Tyr
2380 2385 2390

TAC TAT ATT TCA ACT GTA TAT ACT TGA CAT ATT ATT GCT TCC TTG GGA 5520
Tyr Tyr Ile Ser Thr Val Tyr Thr * His Ile Ile Ala Ser Leu Gly
2395 2400 2405

GGC TCT CTT ATT CCT TTC CCC CGT TGC AAT TAT AGC TCA TTG CTG AAG 5568
Gly Ser Leu Ile Pro Phe Pro Arg Cys Asn Tyr Ser Ser Leu Leu Lys
2410 2415 2420

CAT GGG ATG CAG GAG GCC TCT ATC AAG TAG GTC AAT TCC CTC ACT GGA 5616
His Gly Met Gln Glu Ala Ser Ile Lys * Val Asn Ser Leu Thr Gly
2425 2430 2435 2440

ATG TTT GGT CTG AGT GGA ATG GGA AGG TAA GGT ACC TGT TAA AAG TTT 5664
Met Phe Gly Leu Ser Gly Met Gly Arg * Gly Thr Cys * Lys Phe
2445 2450 2455



GAA TGG CAA ATA CTG ATA GAA ATA TAA CTT ATA TTT GCG ACA TAT ATA 5712
Glu Trp Gln Ile Leu Ile Glu Ile * Leu Ile Phe Ala Thr Tyr Ile
2460 2465 2470

GAT AAA GCA AAA TAA TAC GCA TTC CAC CTG AAC TTT AAA GGG GCA CGC 5760
Asp Lys Ala Lys * Tyr Ala Phe His Leu Asn Phe Lys Gly Ala Arg
2475 2480 2485

AGA ATT ATC CCG CAT CTG TCT ACA AGA ATG ATA ACA CAT GTG CTG AAT 5808
Arg Ile Ile Pro His Leu Ser Thr Arg Met Ile Thr His Val Leu Asn
2490 2495 2500



AGT GAA GTA CTA CTT CTC AAA TGT CTG AAT GAA CGC ACT AAC TCT TGT 5856
Ser Glu Val Leu Leu Leu Lys Cys Leu Asn Glu Arg Thr Asn Ser Cys
2505 2510 2515 2520

GAG TGT CAA CCG AGC AAG AAA TAT TTG AGT TTT CTG CAA GAA ATT GTT 5904
Glu Cys Gln Pro Ser Lys Lys Tyr Leu Ser Phe Leu Gln Glu Ile Val
2525 2530 2535

CAT GTT GTG CTG TAT TAT ACT CCC TCC GTC CGA AAT TAT TTG TCG GAG 5952
His Val Val Leu Tyr Tyr Thr Pro Ser Val Arg Asn Tyr Leu Ser Glu
2540 2545 2550

AAA TGG ATG TAT CTA GAC GTA TTT TAG TTC TAG ATA CAT CCA TTT TTA 6000
Lys Trp Met Tyr Leu Asp Val Phe * Phe * Ile His Pro Phe Leu
2555 2560 2565




TCC ATT TCT GCA ACA AGT AGT TCC GGA CGG AGG GAG TAT CAT TTA ACA 6048
Ser Ile Ser Ala Thr Ser Ser Ser Gly Arg Arg Glu Tyr His Leu Thr
2570 2575 2580

AAT ATA TGC ATG TTC GAA GTA AAT CCC CAC GAA TAA GCA TAT AAG ACG 6096
Asn Ile Cys Met Phe Glu Val Asn Pro His Glu * Ala Tyr Lys Thr
2585 2590 2595 2600

ATA TTG CTT TTT GAC TTG CAA CAC CTA AAC CTC ATT GTT TTC TCC TAG 6144
Ile Leu Leu Phe Asp Leu Gln His Leu Asn Leu Ile Val Phe Ser *
2605 2610 2615

GAT TTT GGG TGT TCG AAG CAA GCA GCT GGT GAT ATT TAA TTT ACC TTT 6192
Asp Phe Gly Cys Ser Lys Gln Ala Ala Gly Asp Ile * Phe Thr Phe
2620 2625 2630

GCC TTT ATT TGT AGC TTG ATT TGA GGG TGC GGC AAA GGT TTT AGC TTA 6240
Ala Phe Ile Cys Ser Leu Ile * Gly Cys Gly Lys Gly Phe Ser Leu
2635 2640 2645

GTA GTG TTT TGT AAA TTA TTA TAG TTT ATG TAT ATA CTC CTC ATT TGG 6288
Val Val Phe Cys Lys Leu Leu * Phe Met Tyr Ile Leu Leu Ile Trp
2650 2655 2660

GCA CTT CCG TAC TGG TCC CAT AGA AGA TAA AAA TGG AAT GAT GTC TGG 6336
Ala Leu Pro Tyr Trp Ser His Arg Arg * Lys Trp Asn Asp Val Trp
2665 2670 2675 2680

CCA ATA ATT GTT GAC AAC ACT GTT GCG CAT TTG ATT TTT ATC AGG GAA 6384
Pro Ile Ile Val Asp Asn Thr Val Ala His Leu Ile Phe Ile Arg Glu
2685 2690 2695

TGG AAA ATT GAA ATC GGT AAG AAA CAT TGC GAT ATT AAG CTT GTA TAT 6432
Trp Lys Ile Glu Ile Gly Lys Lys His Cys Asp Ile Lys Leu Val Tyr
2700 2705 2710

GCT AAT GCT GGT GGA TCT TTA AGA GGG AAC ATA TGA TCT CGT GTG CAT 6480
Ala Asn Ala Gly Gly Ser Leu Arg Gly Asn Ile * Ser Arg Val His
2715 2720 2725

CCA TCT TCA ACT AAA AAA ATA TGT TGC ACA TCT CCC ACG TCA CTT ACT 6528
Pro Ser Ser Thr Lys Lys Ile Cys Cys Thr Ser Pro Thr Ser Leu Thr
2730 2735 2740

AGC TAT TTC ATC CAA GTA CTA ACT TGT GTG GTT GTC TCC TCA GTA CCG 6576
Ser Tyr Phe Ile Gln Val Leu Thr Cys Val Val Val Ser Ser Val Pro
2745 2750 2755 2760

GGA CAT TGT GCG CCA ATT CAT TAA AGG CAC TGA TGG ATT TGC TGG TGG 6624
Gly His Cys Ala Pro Ile His * Arg His * Trp Ile Cys Trp Trp
2765 2770 2775

TTT TGC CGA ATG TCT TTG TGG AAG TCC ACA CCT ATA CCA GGT AAG TTG 6672
Phe Cys Arg Met Ser Leu Trp Lys Ser Thr Pro Ile Pro Gly Lys Leu
2780 2785 2790

TGG CAA TAC TTG GAA ATG GGT TGA GTG AAT GTC ACA TGG ATT TTT TAT 6720
Trp Gln Tyr Leu Glu Met Gly * Val Asn Val Thr Trp Ile Phe Tyr
2795 2800 2805

ATA TAC CAC ATG ATG ATA CAC ATG TAA ATA TAT AAC GAT TAT AGT GTA 6768
Ile Tyr His Met Met Ile His Met * Ile Tyr Asn Asp Tyr Ser Val
2810 2815 2820

TGC ATA TGC ATT TGG CTA AGA AGT ACT CCC TCC CTT AGT AAA AGT TAG 6816
Cys Ile Cys Ile Trp Leu Arg Ser Thr Pro Ser Leu Ser Lys Ser *
2825 2830 2835 2840

TAC AAA GTT GAG TCA TCT ATT TTG GAA CGG AGG GAG TAT AAG TGT ATA 6864
Tyr Lys Val Glu Ser Ser Ile Leu Glu Arg Arg Glu Tyr Lys Cys Ile
2845 2850 2855

CAC TAG TGC AAT ATA TAG GTT TTA ACA CCC AAC TTG CCA ATG AAG GAA 6912
His * Cys Asn Ile * Val Leu Thr Pro Asn Leu Pro Met Lys Glu
2860 2865 2870

CAT AGG GCT TTC TAG TTA TCT TAT TTA TTT GTC TGG TGA ATA ATC CAC 6960
His Arg Ala Phe * Leu Ser Tyr Leu Phe Val Trp * Ile Ile His
2875 2880 2885

TGA AAA ATT CCA GCC ATG TCA TTT TTT AGG GGG GGA GAA GAA ACT ACA 7008
* Lys Ile Pro Ala Met Ser Phe Phe Arg Gly Gly Glu Glu Thr Thr
2890 2895 2900

TTG ATT TTT CCC CCT AAA AAA AGC CAT CTC AGA TTT CAT AGG TAA CTT 7056
Leu Ile Phe Pro Pro Lys Lys Ser His Leu Arg Phe His Arg * Leu
2905 2910 2915 2920

GCT TTT CTG TAA AGA AAT GAA AAC GAC TTC ATA CTT TCT GTC GAT TAT 7104
Ala Phe Leu * Arg Asn Glu Asn Asp Phe Ile Leu Ser Val Asp Tyr
2925 2930 2935

AAG TGT ATA CAC TAG TGC AAT ATA TAG GTT TTA ACA CCC AAC TTG CCA 7152
Lys Cys Ile His * Cys Asn Ile * Val Leu Thr Pro Asn Leu Pro
2940 2945 2950

ATG AAG GAA CAT AGG GCT TTC TAG TTA TCT TAT TTA TTT GCT GGT GAA 7200
Met Lys Glu His Arg Ala Phe * Leu Ser Tyr Leu Phe Ala Gly Glu
2955 2960 2965

TAA TCC ACT GAA AAA TTC CAG CCA TGT CAT TTT TTA GGG GGG AGA AGA 7248
* Ser Thr Glu Lys Phe Gln Pro Cys His Phe Leu Gly Gly Arg Arg
2970 2975 2980

AAC TAT ATT GAT TTT TCC CCC TAA AAA AAG CCA TCT CAG ATT CAT AGG 7296
Asn Tyr Ile Asp Phe Ser Pro * Lys Lys Pro Ser Gln Ile His Arg
2985 2990 2995 3000

AAC TTG CTT TTC TGT AAA GAA ATG AAA ACG ACT TCA TAC TTT CTG CGG 7344
Asn Leu Leu Phe Cys Lys Glu Met Lys Thr Thr Ser Tyr Phe Leu Arg
3005 3010 3015

CGC TTA CTT AGC TCG ATG GAT ATT TGT AAG ATG AAT GCC AAA TTA TTT 7392
Arg Leu Leu Ser Ser Met Asp Ile Cys Lys Met Asn Ala Lys Leu Phe
3020 3025 3030

GGC GGG ATT TGA TCG TTA TTC CAA ATT TCA TTT GGT TTC TCT AGC AAT 7440
Gly Gly Ile * Ser Leu Phe Gln Ile Ser Phe Gly Phe Ser Ser Asn
3035 3040 3045



CAA CCC AGT ACC TTG TTA TTG GCA CTG CAA TTT CTT ATT GAT TAA TCA 7488
Gln Pro Ser Thr Leu Leu Leu Ala Leu Gln Phe Leu Ile Asp * Ser
3050 3055 3060

GGC AGG AGG AAG GAA ACC TTG GCA CAG TAT CAA CTT GGT ATG TGC ACA 7536
Gly Arg Arg Lys Glu Thr Leu Ala Gln Tyr Gln Leu Gly Met Cys Thr
3065 3070 3075 3080

TGA TGG ATT TAC ACT GGG TGA TTT GGT ACA TAT AAT ACC AAG TCA ATT 7584
* Trp Ile Tyr Thr Gly * Phe Gly Thr Tyr Asn Thr Lys Ser Ile
3085 3090 3095

TAC CAA ATG GGG AGA CCA ATA GAG ATG GAG AAA ATC ACA ATC TTA GCT 7632
Tyr Gln Met Gly Arg Pro Ile Glu Met Glu Lys Ile Thr Ile Leu Ala
3100 3105 3110

GGA ATT GTG GGG AGG TAA TTC TGA ACT CTC CTT TTT TTT TGA AAT TTT 7680
Gly Ile Val Gly Arg * Phe * Thr Leu Leu Phe Phe * Asn Phe
3115 3120 3125

CAT GCT TTA CAT AAT AGT CAA ATG GCT GAC AAA TGT CGT TGT ATG GTT 7728
His Ala Leu His Asn Ser Gln Met Ala Asp Lys Cys Arg Cys Met Val
3130 3135 3140

CTC TCT ACC TAA ACC GTT AAG GCA GTA AGA GTT TCC CTA CAA GAT CTC 7776
Leu Ser Thr * Thr Val Lys Ala Val Arg Val Ser Leu Gln Asp Leu
3145 3150 3155 3160



TTT GTT CGT ATA ATT GTA TTT TCT AGA GAA AAG TTG CCT TCA ATT TTG 7824
Phe Val Arg Ile Ile Val Phe Ser Arg Glu Lys Leu Pro Ser Ile Leu
3165 3170 3175

TGC ACG CGG CAG TAC AGG AAT TGT GGT TAT AAA TAT TGA TAC AGG CTG 7872
Cys Thr Arg Gln Tyr Arg Asn Cys Gly Tyr Lys Tyr * Tyr Arg Leu
3180 3185 3190

ACC ATC GTT ACT AAT AGG GGG AAC AAT AAG CAC ATT TTT TTA ATA GCA 7920
Thr Ile Val Thr Asn Arg Gly Asn Asn Lys His Ile Phe Leu Ile Ala
3195 3200 3205

AAG GCA TCA CCC TTG TTC CGT TTC CAA TGA AAT CAC AGT ATC CGA ACC 7968
Lys Ala Ser Pro Leu Phe Arg Phe Gln * Asn His Ser Ile Arg Thr
3210 3215 3220

ATA AGT TTT ACA AGT ATG CGT AGA GAG AAA TAA AGT ATC AAC CCG GCA 8016
Ile Ser Phe Thr Ser Met Arg Arg Glu Lys * Ser Ile Asn Pro Ala
3225 3230 3235 3240

GAA ACA GTT GTT TCA GGC GCA AAG AGA AAA GGA AAC GAT ATG CTC TAT 8064 

Glu Thr Val Val Ser Gly Ala Lys Arg Lys Gly Asn Asp Met Leu Tyr 
3245 3250 3255 

TAC ATC AAC CTT TTA GCA TTT AGG GAC GAC CAG CAT CAT CCC ATC TTC 8112
Tyr Ile Asn Leu Leu Ala Phe Arg Asp Asp Gln His His Pro Ile Phe
3260 3265 3270

AAT CAA CTG GAG CGA GGT CAC CTC CAA TCT TCT CAG CAG CCT CAG AGT 8160
Asn Gln Leu Glu Arg Gly His Leu Gln Ser Ser Gln Gln Pro Gln Ser
3275 3280 3285

GGT GAC CTC CCA AGC AAG TGC ATC AGC ATC CAT CAT CTG GGG GTT GGG 8208
Gly Asp Leu Pro Ser Lys Cys Ile Ser Ile His His Leu Gly Val Gly
3290 3295 3300

CAC ATA CCA TGA GCA CAA TCA CCT GAA TTT GAT GAA TTT TCC TCT GTT 8256
His Ile Pro * Ala Gln Ser Pro Glu Phe Asp Glu Phe Ser Ser Val
3305 3310 3315 3320

TAC CTT GCA GCA GAC CCC TGC CGT ATA AAT GGT TTT AAA TGA CAG CAT 8304
Tyr Leu Ala Ala Asp Pro Cys Arg Ile Asn Gly Phe Lys * Gln His
3325 3330 3335



GTT CTT TCA GTT TGA GCA AAA TTT GTG CAA TTG CAA AGA AGC TTT AGA 8352
Val Leu Ser Val * Ala Lys Phe Val Gln Leu Gln Arg Ser Phe Arg
3340 3345 3350

ATC ATG TGG AAC ATG CAC TTA CAT TTC ATC TGA CAA TAT AGG AAG GAG 8400
Ile Met Trp Asn Met His Leu His Phe Ile * Gln Tyr Arg Lys Glu
3355 3360 3365

AGC CCG ACG TCG CAT GCT CCT CTA GAC TCG AGG AAT TCG CAA GAT TGT 8448
Ser Pro Thr Ser His Ala Pro Leu Asp Ser Arg Asn Ser Gln Asp Cys
3370 3375 3380

CTG TCA AAA GAT TGA GGA AGA GGC AGA TGC GCA ATT TCT TTG TTT GTC 8496
Leu Ser Lys Asp * Gly Arg Gly Arg Cys Ala Ile Ser Leu Phe Val
3385 3390 3395 3400

TCA TGG TTT CTC AAG TAA GAC TTA TAT CTG ATC TCT TCA ATT TTT GAG 8544
Ser Trp Phe Leu Lys * Asp Leu Tyr Leu Ile Ser Ser Ile Phe Glu
3405 3410 3415

ATT GCC TGT TTT TCA CAA TGG CAT ATG TTG TCA GGT GAA ACA TCC AAT 8592
Ile Ala Cys Phe Ser Gln Trp His Met Leu Ser Gly Glu Thr Ser Asn
3420 3425 3430



CCC AGT ATT AAT AGA GCC AAC ATG AAG GGA TTG CTT ATC TGA GAT ATC 8640
Pro Ser Ile Asn Arg Ala Asn Met Lys Gly Leu Leu Ile * Asp Ile
3435 3440 3445

TGC CAA AGT TGA ATT CTT AGA TTC ACC TTC TTC AGT ATT TCA GAC CTT 8688
Cys Gln Ser * Ile Leu Arg Phe Thr Phe Phe Ser Ile Ser Asp Leu
3450 3455 3460

CTA AGC ATT TTC ATT TTT TTT TTC AAT TGT TAG GGA GTT CCA ATG TTT 8736
Leu Ser Ile Phe Ile Phe Phe Phe Asn Cys * Gly Val Pro Met Phe
3465 3470 3475 3480

TAC ATG GGC GAT GAA TAT GGC CAC ACA AAA GGG GGC AAC AAC AAT ACA 8784
Tyr Met Gly Asp Glu Tyr Gly His Thr Lys Gly Gly Asn Asn Asn Thr
3485 3490 3495

TAC TGC CAT GAT TCT TAT GTC AGT ACA ATT TGG TCA CAT ATT GTT GTT 8832
Tyr Cys His Asp Ser Tyr Val Ser Thr Ile Trp Ser His Ile Val Val
3500 3505 3510

CTA AGT AAC TAT CTT CAA ATC TTT GCA TTC ATC CGT CAT GGC TCT TCT 8880
Leu Ser Asn Tyr Leu Gln Ile Phe Ala Phe Ile Arg His Gly Ser Ser
3515 3520 3525

GTA GGT CAA TTA TTT TCG CTG GGA TAA AAA AGA ACA ATA CTC TGA CTT 8928
Val Gly Gln Leu Phe Ser Leu Gly * Lys Arg Thr Ile Leu * Leu
3530 3535 3540

GCA AAG ATT CTG CTG CCT CAT GAC CAA ATT CCG CAA GTA AGT ATT CCG 8976
Ala Lys Ile Leu Leu Pro His Asp Gln Ile Pro Gln Val Ser Ile Pro
3545 3550 3555 3560

TTG AAT AAT TTC TGT GTA GAA CCA CTG AAG GTG CCT CCA AAC GCT AAG 9024
Leu Asn Asn Phe Cys Val Glu Pro Leu Lys Val Pro Pro Asn Ala Lys
3565 3570 3575

CGA GCA AGG TCA ATT TCA CAC CCT AAT CAA GTT GGT GTT GTC TAT TTG 9072
Arg Ala Arg Ser Ile Ser His Pro Asn Gln Val Gly Val Val Tyr Leu
3580 3585 3590

TGT ATT TGA TCT GCT GCA CTG TAG GGA GTG CGA GGG TCT TGG CCT TGA 9120
Cys Ile * Ser Ala Ala Leu * Gly Val Arg Gly Ser Trp Pro *
3595 3600 3605

GGA CTT TCC AAC GGC CGA ACG GCT GCA GTG GCA TGG TCA TCA GCC TGG 9168
Gly Leu Ser Asn Gly Arg Thr Ala Ala Val Ala Trp Ser Ser Ala Trp
3610 3615 3620

GAA GCC TGA TTG GTC TGA GAA TAG CCG ATT CGT TGC CTT TTC CAT GGT 9216
Glu Ala * Leu Val * Glu * Pro Ile Arg Cys Leu Phe His Gly
3625 3630 3635 3640

ACA CAT ATA GTT CTG ACA CTT CAC TAT AGT TGT TTT AAA AAA GAA AAT 9264
Thr His Ile Val Leu Thr Leu His Tyr Ser Cys Phe Lys Lys Glu Asn
3645 3650 3655

TTA ACT CAA AAG TAA ATT ATG GAG A 9289
Leu Thr Gln Lys * Ile Met Glu
3660
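
The three-letter residues printed beneath each codon row of the listing follow the standard genetic code, with `*` marking a stop codon. The sketch below is illustrative only and is not part of the specification; the names `translate_row`, `CODON_TO_ONE` and `ONE_TO_THREE` are hypothetical. It shows how one row of the listing can be re-derived from its codons, which is useful for checking a transcribed row.

```python
# Illustrative sketch only: re-derive the three-letter translation of one
# codon row using the standard genetic code. All names are hypothetical.
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TO_ONE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

ONE_TO_THREE = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "*",
}


def translate_row(row: str) -> str:
    """Translate whitespace-separated codons; skip any incomplete trailing codon."""
    residues = []
    for codon in row.upper().split():
        if len(codon) != 3:
            continue  # e.g. the single trailing base at the end of the listing
        residues.append(ONE_TO_THREE[CODON_TO_ONE[codon]])
    return " ".join(residues)


if __name__ == "__main__":
    print(translate_row("TTA ACT CAA AAG TAA ATT ATG GAG A"))
```

Because the listing translates the whole genomic region in a single frame, stop codons appear wherever non-coding sequence is read through; the sketch reproduces that behaviour rather than terminating at the first stop.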


CLAIMS 

1. A nucleic acid sequence encoding an enzyme of the
starch biosynthetic pathway in a cereal plant, wherein the
enzyme is selected from the group consisting of starch
branching enzyme I, starch branching enzyme II, starch
soluble synthase I, and debranching enzyme, with the proviso
that the enzyme is not soluble starch synthase I of rice, or
starch branching enzyme I of rice or maize.

2. A sequence according to claim 1, wherein the
sequence is a genomic DNA or cDNA sequence.

3. A sequence according to claim 1 or claim 2,
wherein the sequence is functional in wheat.

4. A sequence according to any one of claims 1 to 3,
wherein the sequence is derived from a Triticum species.

5. A sequence according to claim 4, wherein the
Triticum species is Triticum tauschii.

6. A sequence according to any one of claims 1 to 5,
wherein the sequence encodes starch branching enzyme I or a
biologically-active fragment thereof, and wherein the
sequence has at least 70% sequence homology with the
sequence shown in SEQ ID NO: 5 or SEQ ID NO: 9.

7. A sequence according to claim 6, wherein the
homology is at least 90%.

8. A sequence according to any one of claims 1 to 5,
wherein the sequence encodes starch branching enzyme II or a
biologically-active fragment thereof, and wherein the
sequence has at least 70% sequence homology with the
sequence shown in SEQ ID NO: 10.

9. A sequence according to claim 8, wherein the
homology is at least 90%.

10. A sequence according to any one of claims 1 to 5,
wherein the sequence encodes soluble starch synthase or a
biologically-active fragment thereof, and wherein the
sequence has at least 70% sequence homology with the
sequence shown in SEQ ID NO: 11 or SEQ ID NO: 13.

11. A sequence according to claim 10, wherein the
homology is at least 90%.

12. A sequence according to claim 11, wherein the
sequence encodes a 75 kD soluble starch synthase of wheat.

13. A sequence according to claim 12, which encodes an
amino acid sequence at least 70% homologous to that shown in
SEQ ID NO: 14.

14. A sequence according to any one of claims 1 to 5,
wherein the sequence encodes debranching enzyme or a
biologically-active fragment thereof, and wherein the
sequence has at least 70% sequence homology with the
sequence shown in SEQ ID NO: 17.

15. A sequence according to claim 14, wherein the
homology is at least 90%.

16. A promoter of an enzyme selected from the group
consisting of starch branching enzyme I, starch branching
enzyme II, starch soluble synthase I, and debranching
enzyme, with the proviso that the enzyme is not soluble
starch synthase I of rice, or starch branching enzyme I of
rice or maize.

17. A promoter according to claim 16, wherein the
promoter is a starch branching enzyme I promoter or
biologically-active fragment thereof, and wherein the
promoter sequence has at least 70% sequence homology with
the sequence shown in SEQ ID NO: 8.

18. A sequence according to claim 17, wherein the
homology is at least 90%.

19. A promoter according to claim 16, wherein the
promoter is a starch soluble synthase I promoter or
biologically-active fragment thereof, and wherein the
promoter sequence has at least 70% sequence homology with
the sequence shown in SEQ ID NO: 15.

20. A sequence according to claim 19, wherein the
homology is at least 90%.

21. A nucleic acid construct comprising a nucleic acid
sequence encoding an enzyme of the starch biosynthetic
pathway in a cereal plant, operably linked to one or more
nucleic acid sequences facilitating expression of the
nucleic acid sequence in a plant, wherein the enzyme is
selected from the group consisting of starch branching
enzyme I, starch branching enzyme II, starch soluble
synthase I, and debranching enzyme, with the proviso that
the enzyme is not soluble starch synthase I of rice, or
starch branching enzyme I of rice or maize, a biologically-
active fragment thereof.

22. A nucleic acid construct for targeting a gene to
the endosperm of a cereal plant, comprising one or more
promoter sequences selected from the group consisting of
SBE I promoter, SBE II promoter, SSS I promoter, and
DBE promoter, operatively linked to a nucleic acid sequence
encoding a protein, wherein the expression of the targeted
gene in the endosperm of a cereal plant is modified.

23. A construct according to either claim 21 or claim
22, wherein the promoter or nucleic acid sequence is also
operatively linked to one or more additional targeting
sequences and/or one or more 3' untranslated sequences.

24. A construct according to claim 23, wherein the
nucleic acid encoding the protein is either in the sense or
antisense orientation.

25. A construct according to claim 24, wherein the
protein is an enzyme of the starch biosynthetic pathway.

26. A construct according to claim 25, wherein the
nucleic acid encoding the protein is in the antisense
orientation, and the enzyme is selected from the group
consisting of GBSS, starch debranching enzyme, SBE II, low
molecular weight glutenin, and grain softness protein I.

27. A construct according to claim 25, wherein the
nucleic acid encoding the protein is in the sense
orientation, and the enzyme is selected from the group
consisting of bacterial isoamylase, bacterial glycogen
synthase, and wheat high molecular weight glutenin Bx17.

28. A construct according to any one of claims 21 to
27, wherein the plant is a cereal plant.

29. A construct according to claim 28, wherein the
cereal plant is either wheat or barley.

30. A construct according to claim 29, wherein the
cereal plant is wheat.

31. A construct according to any one of claims 21 to
30, wherein the construct is either a plasmid or a vector.

32. A construct according to claim 31, wherein the
plasmid or vector is suitable for use in the transformation
of a plant.

33. A construct according to claim 32, wherein the
plasmid is selected from the group consisting of those
depicted in Figures 22a to 22f.

34. A construct according to claim 32, wherein the
vector is a bacterium of the genus Agrobacterium.

35. A construct according to claim 34, wherein the
vector is Agrobacterium tumefaciens.

36. A method of modifying the characteristics of
starch produced by a plant, comprising the steps of:

(a) introducing a nucleic acid sequence encoding
an enzyme of the starch biosynthetic pathway into a host
plant, and/or

(b) introducing an anti-sense nucleic acid
sequence directed to a gene encoding an enzyme of the starch
biosynthetic pathway into a host plant,

wherein the enzyme is selected from the group
consisting of starch branching enzyme I, starch branching
enzyme II, starch soluble synthase I, and debranching
enzyme, with the proviso that the enzyme is not soluble
starch synthase I of rice, or starch branching enzyme I of
rice or maize, and wherein if both steps (a) and (b) are
used, the enzymes in the two steps are different.

37. A method according to claim 36, wherein the plant
is a cereal plant.

38. A method according to claim 37, wherein the cereal
plant is wheat or barley.

39. A method of targeting expression of a gene to the
endosperm of a cereal plant, comprising the step of
transforming the plant with a construct according to any one
of claims 21 to 35.

40. A method of modulating the time of expression of a
gene in endosperm of a cereal plant, comprising the step of
transforming the plant with a construct according to any one
of claims 21 to 35.

41. A method according to claim 40, wherein when
expression at an early stage following anthesis is desired,
the construct comprises either the SBE II, SSS I, or DBE
promoter.

42. A method according to claim 40, wherein when
expression at a later stage following anthesis is desired,
the construct comprises the SBE I promoter.

43. A plant transformed with a construct according to
any one of claims 21 to 35.

44. A plant according to claim 43, wherein the plant
is a cereal plant.

45. A plant according to claim 44, wherein the cereal
plant is wheat or barley.

46. A method of identifying variations in the starch
synthesis characteristics of a cereal plant, comprising the
step of identifying a variation in nucleic acid sequence in
the intron regions of the SBE I, SBE II, SSS I or DBE genes.

47. A method of identifying variations in the starch 
35 synthesis characteristics of a cereal plant, comprising the 

step of identifying a variation in nucleic acid sequence 
compared to the sequence shown in one or more SEQ ID NO: 5, 
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SEQ ID NO: 7, SEQ ID NO : 9 , SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID 
NO: 13, SEQ ID NO: 15, SEQ ID NO: 16, or SEQ ID NO: 17. 

48. A method according to claim 47, in which a
mutation or absence of a SBE I, SBE II, SSS I or DBE gene is
detected.

49. A method according to either claim 47 or claim 48, 
in which the cereal plant is wheat or barley. 

50. A product comprising plant material propagated
from a plant transformed with a nucleic acid sequence
encoding an enzyme of the starch biosynthetic pathway in a
cereal plant, operably linked to one or more nucleic acid
sequences facilitating expression of the nucleic acid
sequence in a plant, wherein the enzyme is selected from the
group consisting of starch branching enzyme I, starch
branching enzyme II, soluble starch synthase I, and
debranching enzyme, with the proviso that the enzyme is not
soluble starch synthase I of rice, or starch branching
enzyme I of rice or maize, a biologically-active fragment
thereof.

51. A product comprising plant material propagated
from a plant in which a gene was targeted to the endosperm
of a cereal plant, by a nucleic acid construct comprising
one or more promoter sequences selected from the group
consisting of SBE I promoter, SBE II promoter, SSS I
promoter, and DBE promoter, operatively linked to a nucleic
acid sequence encoding a protein, wherein the expression of
the targeted gene in the endosperm of a cereal plant is
modified.

52. A product according to claim 50 or claim 51,
wherein the product is a food product.
FIGURE 7
Figure 16
Figure 20b
Set  Key       Forward Primer  Forward Primer Sequence        Reverse Primer  Reverse Primer Sequence        Temp  bp
1    E01'/E02  WBE2E1F         CGT CGC TGC TCC TCA GGA AG     WBE2E2R         CAG GAC CTT CCC TGG AGA GG     57.4  401
2    E01/E02   sr854.1180F     CTG GCT GAC TCA ATC ACT ACG    WSBE9E2R        GGC ACG AGT GTG TGT ACC TGT A  57.7  601
3    E02/E03   WBE2E2F         CGC AAC CTG AAG AAT TAC AG     sr866F          TAT CTT CAG GTA TCT ACA GC     49.8  309
4    E03/E04   WBE2E3F         ATT TTC GGA GCC ATC TTG AC     WBE2E4R2        ATG CTT CCA ATC CAC CTT CA           >450
5    E04/E05   WBE2E4F         TCG TGG TTA TGA AAA GCT TGG    WBE2E5R         GAG CCC ATT CTC GGT AAG TGA    50.5  234
6    E05/E06   sr913F          ATC ACT TAC CGA GAA TGG G      WBE2E6R         CTG CAT TTG GAT TCC AAT TG     49.9  232
7    E05/I05   sr913F          ATC ACT TAC CGA GAA TGG G      WBE2I5R         CAG TAA GCT AGT TGG TGA ATA    46.6  106
8    E06/E07   WBE2E6F         ACA ATT GGA ATC CAA ATG CA     WBE2E7R         GGG AGG AAA ATC TCC CAA AC     51.0  402
9    E07/E08   WBE2E7F         AGC TAT TCC TCA TGG CTC AC     sr915F          CCA TTG AAA GGT ATT TCA CC     51.1  203
10   E08/E09   WBE2E8F         TGC AGG CTC CAG GTG AAA TA     sr912F          TAA CTT ATT GAC ATA CCG G      48.4  439
11   E10/E11   da5.seq         GGC TTG GAT ACA ATG CAG TGC    WBE2E11R        CTG GAG TTC CAA AAC GGC TAC    51.2  289
12   E12/E13   dal51.seq       TTG ACG GCT TGA ATG GTT TC     WBE2E13R        ATT CTT CAA GCC ACC ATC TC     51.6  244
13   E17/E18   WBE2E17F        TTT AGG TGG TGA AGG CTA TCT    WBE2E18R        TAT TGT TAT TTC CAG GGG AGA    50.2  258
14   E18/E19   sr860R          AAT GGA TAG ATT TTC CAA GAG G  da23.seq        TGC TGC ATT GCC TGA TCG AA     50.4  -295
15   E19_3'    WBE2-2395F      AGC AGA ACT GCG GTC GTG TA     WBE2-2634R      AAC ACC CAG GCC CGT CCA TT     57.2  240
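
The primer sequences listed above are written in spaced triplets. As an illustration only, the following Python sketch shows one way such entries might be normalised and inspected before use: it strips the spacing, computes the reverse complement, and reports GC content together with a rule-of-thumb melting estimate. The function names are hypothetical and do not come from the specification. The "2 + 4" (Wallace) estimate is a generic approximation swapped in for illustration; the specification does not state how its Temp column was derived, so the two should not be expected to agree.

```python
# Illustrative helpers only; all names here are hypothetical and are not
# taken from the specification.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def clean(seq):
    """Remove the triplet spacing used in the table and upper-case the bases."""
    return "".join(seq.split()).upper()


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return clean(seq).translate(COMPLEMENT)[::-1]


def gc_fraction(seq):
    """Fraction of bases that are G or C."""
    s = clean(seq)
    return (s.count("G") + s.count("C")) / len(s)


def wallace_tm(seq):
    """Rule-of-thumb estimate: 2 degrees per A/T plus 4 degrees per G/C.

    This is an assumption made for illustration, not the method behind
    the Temp column of the table.
    """
    s = clean(seq)
    gc = s.count("G") + s.count("C")
    return 2 * (len(s) - gc) + 4 * gc


if __name__ == "__main__":
    # Forward primers from sets 1 and 8, copied from the table.
    for name, seq in [
        ("WBE2E1F", "CGT CGC TGC TCC TCA GGA AG"),
        ("WBE2E6F", "ACA ATT GGA ATC CAA ATG CA"),
    ]:
        print(name, len(clean(seq)), round(gc_fraction(seq), 2),
              wallace_tm(seq), reverse_complement(seq))
```

Comparing the reverse complement of WBE2E6F with WBE2E6R (set 6) in this way shows that the two primers cover largely the same stretch of exon 6 on opposite strands, which is consistent with adjacent primer sets being designed to overlap.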
SBE II Intron 5 primer set - digested with DdeI
SBE II Intron 10 primer set - digested with DdeI